[
https://issues.apache.org/jira/browse/KYLIN-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866615#comment-16866615
]
ASF subversion and git services commented on KYLIN-4017:
--------------------------------------------------------
Commit a74dc055a163e6adb0269e0924fbc78e8f997db2 in kylin's branch
refs/heads/master-hadoop3.1 from wangxiaojing
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=a74dc05 ]
KYLIN-4017 Build engine get zk(zookeeper) lock failed when building job, this
causes the whole build engine doesn't work.
> Build engine get zk(zookeeper) lock failed when building job, it causes the
> whole build engine doesn't work.
> ------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-4017
> URL: https://issues.apache.org/jira/browse/KYLIN-4017
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine, Tools, Build and Test
> Affects Versions: Future, v3.0.0, v3.0.0-alpha
> Reporter: wangxiaojing
> Priority: Critical
> Labels: build
> Fix For: Future, v3.0.0-alpha
>
> Attachments: zkinstancestart.png
>
>
> Kylin throws an exception while acquiring a ZooKeeper (ZK) lock when it
> builds a job. Only a restart resolves the problem; otherwise no job can be
> built and the whole build engine stops working. The problem recurs about one
> day after the restart. The log looks like this:
> {code:java}
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:59 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:63 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} scheduled
> 2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 878974c4-4c65-88a4-a912-b238fcc33bdc-132] zookeeper.ZookeeperDistributedLock:92 : [email protected] trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
> 2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] threadpool.DistributedScheduler:115 : unknown error execute job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: [email protected]
> java.lang.IllegalStateException: Error while [email protected] trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
> at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
> at org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
> at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: instance must be started before calling this method
> at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
> at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
> ... 5 more{code}
>
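
Note on the trace above: the "Caused by" message is what Curator raises when
create() is called on a client that is not in the STARTED state, i.e. one that
was never started or has already been closed. The report does not say which
case applies, and the sketch below is not the code from the commit. It is a
stdlib-only model of the failure mode and of one possible guard: check the
client state before locking and replace the client if it is no longer usable.
LockClientGuard, FakeClient and createLockNode are hypothetical names invented
for this illustration; only the three state names and the exception message
come from Curator.

```java
import java.util.function.Supplier;

/** Stdlib-only model of the failure; this is not Kylin or Curator code. */
class LockClientGuard {

    /** Mirrors the three lifecycle states of a Curator client. */
    enum State { LATENT, STARTED, STOPPED }

    /** Hypothetical stand-in for a ZooKeeper client with a start/close lifecycle. */
    static class FakeClient {
        private State state = State.LATENT;

        void start() { state = State.STARTED; }
        void close() { state = State.STOPPED; }
        State state() { return state; }

        /** Fails with the message from the trace when the client is not started. */
        String createLockNode(String path) {
            if (state != State.STARTED) {
                throw new IllegalStateException(
                        "instance must be started before calling this method");
            }
            return path;
        }
    }

    private final Supplier<FakeClient> factory;
    private FakeClient client;

    LockClientGuard(Supplier<FakeClient> factory) {
        this.factory = factory;
        this.client = factory.get();
        this.client.start();
    }

    FakeClient client() { return client; }

    /** Replaces a client that is not STARTED, then tries to take the lock. */
    synchronized String lock(String path) {
        if (client.state() != State.STARTED) {
            client = factory.get();
            client.start();
        }
        return client.createLockNode(path);
    }

    public static void main(String[] args) {
        LockClientGuard guard = new LockClientGuard(FakeClient::new);
        guard.client().close(); // simulate a shared client being closed elsewhere
        System.out.println(guard.lock("/job_engine/lock/example-job"));
    }
}
```

Without the state check in lock(), the call in main would fail with the same
IllegalStateException as in the log, and would keep failing until the process
is restarted, which matches the behaviour described in the report.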
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)