Hi there,
  We noticed a segment refresh job at ERROR status at "*2019-12-10 02:27:15*
". After a deep dive into kylin.log, we found that the
WriteConflictException was raised when trying to update the last modified
time for the dict.

(Sensitive content are consered)

2019-12-10 02:27:15,576 ERROR [Scheduler 1186524190 Job
> f76fb12e-477f-b404-e021-adaf520d04bb-321] common.HadoopShellExecutable:65 :
> error execute
> HadoopShellExecutable{id=f76fb12e-477f-b404-e021-adaf520d04bb-03,
> name=Build Dimension Dictionary, state=RUNNING}
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting
> conflict
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict,
> expect old TS 1575916011421, but it is 1575916035573
> at
> org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
> at
> org.apache.kylin.common.persistence.ResourceStore.lambda$updateTimestampWithRetry$4(ResourceStore.java:443)
> at
> org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
> at
> org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:442)
> at
> org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:437)
> at
> org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:432)
> at
> org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
> at
> org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
> at
> org.apache.kylin.dict.DictionaryManager.saveDictionary(DictionaryManager.java:339)
> at
> org.apache.kylin.cube.CubeManager$DictionaryAssist.saveDictionary(CubeManager.java:1145)
> at org.apache.kylin.cube.CubeManager.saveDictionary(CubeManager.java:1107)
> at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:100)
> at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:69)
> at
> org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
> at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)
> at
> org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
> at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
> at
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
> at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
> at
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>

We have a glance at the codebase (Kylin-2.6.4), we found the last modified
time of the dict (which has been built) will be updated at the step "Build
Dimension Dictionary". The log blow shows the same dict is processed many
times.

Liangs-MacBook-Pro:~ pwrliang$ cat 12-10.log |grep "has already been built,
> save it"|grep "THIS_IS_COLUMN_NAME"
> 2019-12-10 02:07:10,103 DEBUG [Scheduler 1186524190 Job
> 5c4b2ee9-c8aa-9a69-ffb0-3a6811ad22a9-372] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:11,036 DEBUG [Scheduler 1186524190 Job
> 593979b8-e80b-590f-335b-526c2fc49080-295] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:20,455 DEBUG [Scheduler 1186524190 Job
> 160952f3-fef2-4fdf-c08a-77d49f830805-246] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:20,888 DEBUG [Scheduler 1186524190 Job
> 69304c1c-cc02-3bbe-67d4-efdd3858e88a-234] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:21,634 DEBUG [Scheduler 1186524190 Job
> 345b688c-7992-f2cf-ae06-835675f545ea-117] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:23,040 DEBUG [Scheduler 1186524190 Job
> 0ea711ed-71b5-ae47-a34a-f0a27fd9d7cc-303] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:07:36,358 DEBUG [Scheduler 1186524190 Job
> 6daf351f-5073-128f-8754-f499e345db54-308] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:26:43,716 DEBUG [Scheduler 1186524190 Job
> c90a765c-976f-49c2-38d8-e6812d091cfb-295] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:26:44,944 DEBUG [Scheduler 1186524190 Job
> cd108004-12c6-078f-8646-f29841de9e40-205] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:26:51,306 DEBUG [Scheduler 1186524190 Job
> d808dadf-8eea-34db-1b1f-d50e923b4e3a-372] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:27:15,523 DEBUG [Scheduler 1186524190 Job
> ea07ef8f-b2b7-b3f8-c481-df615025217d-117] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:27:15,535 DEBUG [Scheduler 1186524190 Job
> f76fb12e-477f-b404-e021-adaf520d04bb-321] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
> 2019-12-10 02:29:42,504 DEBUG [Scheduler 1186524190 Job
> f894f427-b170-19b6-76fb-e1b671659771-321] cli.DictionaryGeneratorCLI:99 :
> Dict for 'THIS_IS_COLUMN_NAME' has already been built, save it
>

Since we issue the refresh segments request concurrently (with REST APIs),
the dict has a chance to be modified concurrently. Kylin uses checkAndPut
to update HBase to prevent race-condition, so the WriteConflictException
may be raised when updating the same resource.

Liangs-MacBook-Pro:~ pwrliang$ cat 12-10.log |grep
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict|grep
> "Update dictionary "
> 2019-12-10 02:07:10,223 INFO  [Scheduler 1186524190 Job
> 5c4b2ee9-c8aa-9a69-ffb0-3a6811ad22a9-372] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575914830203
> 2019-12-10 02:07:11,109 INFO  [Scheduler 1186524190 Job
> 593979b8-e80b-590f-335b-526c2fc49080-295] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575914831090
> 2019-12-10 02:07:20,554 INFO  [Scheduler 1186524190 Job
> 160952f3-fef2-4fdf-c08a-77d49f830805-246] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575914840528
> 2019-12-10 02:07:21,762 INFO  [Scheduler 1186524190 Job
> 345b688c-7992-f2cf-ae06-835675f545ea-117] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575914841727
> 2019-12-10 02:07:23,253 INFO  [Scheduler 1186524190 Job
> 0ea711ed-71b5-ae47-a34a-f0a27fd9d7cc-303] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575914843095
> 2019-12-10 02:26:43,843 INFO  [Scheduler 1186524190 Job
> c90a765c-976f-49c2-38d8-e6812d091cfb-295] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575916003771
> 2019-12-10 02:26:51,436 INFO  [Scheduler 1186524190 Job
> d808dadf-8eea-34db-1b1f-d50e923b4e3a-372] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575916011421
> 2019-12-10 02:27:15,593 INFO  [Scheduler 1186524190 Job
> ea07ef8f-b2b7-b3f8-c481-df615025217d-117] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575916035573
> 2019-12-10 02:29:42,564 INFO  [Scheduler 1186524190 Job
> f894f427-b170-19b6-76fb-e1b671659771-321] dict.DictionaryManager:198 :
> Update dictionary
> /dict/THIS_IS_CUBE_NAME/THIS_IS_COLUMN_NAME/010dbf72-52a9-e759-08a7-ed7cde0c6e0d.dict
> lastModifiedTime to 1575916182549
>

The log above shown that the job "f76fb12e-477f-b404-e021-adaf520d04bb" has
not been print out the log - "Update dictionary blablabla...." because of
the WriteConflictException.

The question is: refreshing segments concurrently is supported for Kylin
2.6.4? Concurrently refreshing is very important for us, since we want to
reduce the building time.

Regards,
Liang

Reply via email to