[
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368007#comment-17368007
]
Zhong Yanghong commented on KYLIN-4165:
---------------------------------------
Why do we need a cube-level lock for this, given that the dictionary is segment-level?
> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> -----------------------------------------------------------------------
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
> Issue Type: Bug
> Components: Real-time Streaming
> Affects Versions: v3.0.0-alpha
> Reporter: wangxiaojing
> Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when
> building a real-time segment from remote persisted to ready. This is
> serious: it causes dictionary updates to fail when multiple jobs run
> concurrently. It may occur when a cube has many concurrent building
> jobs on the same step, "Save Cube Dictionaries".
> Perhaps a globally distributed lock is needed to prevent this step from
> running concurrently for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
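
For illustration only, here is a minimal sketch of the kind of timestamp check that the error message above suggests ("expect old TS ..., but it is ..."). It is an in-memory model, not Kylin's ResourceStore: the class names, the method names and the resource path are all hypothetical. It shows two writers that both read the same old timestamp, the second write failing the check, and one possible way to recover by re-reading the timestamp before retrying.

```java
import java.util.concurrent.ConcurrentHashMap;

public class TimestampConflictSketch {

    /** Hypothetical stand-in for a write-conflict error; not Kylin's exception class. */
    static class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory store: resource path -> last-modified timestamp. */
    static class InMemoryStore {
        private final ConcurrentHashMap<String, Long> timestamps = new ConcurrentHashMap<>();

        void put(String path, long ts) {
            timestamps.put(path, ts);
        }

        long getTimestamp(String path) {
            return timestamps.get(path);
        }

        /** Updates the timestamp only if the stored value still equals expectedOldTs. */
        void updateTimestamp(String path, long expectedOldTs, long newTs) {
            // replace(key, oldValue, newValue) is an atomic compare-and-set on the entry
            if (!timestamps.replace(path, expectedOldTs, newTs)) {
                throw new ConflictException("Overwriting conflict " + path
                        + ", expect old TS " + expectedOldTs
                        + ", but it is " + timestamps.get(path));
            }
        }

        /** Re-reads the current timestamp before each attempt, so a stale value is never reused. */
        void updateTimestampWithReread(String path, long newTs, int maxAttempts) {
            for (int attempt = 1; ; attempt++) {
                long current = getTimestamp(path);
                try {
                    updateTimestamp(path, current, newTs);
                    return;
                } catch (ConflictException e) {
                    if (attempt >= maxAttempts) {
                        throw e;
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        String path = "/dict/EXAMPLE_TABLE/EXAMPLE_COLUMN/example.dict"; // hypothetical path
        store.put(path, 100L);

        // Both jobs read the same old timestamp before either one writes.
        long seenByJobA = store.getTimestamp(path);
        long seenByJobB = store.getTimestamp(path);

        store.updateTimestamp(path, seenByJobA, 200L); // job A succeeds

        try {
            store.updateTimestamp(path, seenByJobB, 300L); // job B holds a stale timestamp
        } catch (ConflictException e) {
            System.out.println("Job B failed: " + e.getMessage());
        }

        // Job B re-reads the timestamp and tries again.
        store.updateTimestampWithReread(path, 300L, 3);
        System.out.println("Final timestamp: " + store.getTimestamp(path));
    }
}
```

In this model the conflict is tied to a single dictionary resource rather than to a whole cube, so a lock or retry scoped to that resource would be enough here. Whether the same holds for the real "Save Cube Dictionaries" step would need to be checked against the Kylin code.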