[ https://issues.apache.org/jira/browse/KYLIN-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383469#comment-15383469 ]
Zhong Yanghong commented on KYLIN-1883:
---------------------------------------
No, it still does not work in our environment. Since TIME_THREADSHOLD applies
to the time DICT_A was created, this case always happens for streaming with a
retention days setting: the segment and the related DICT were created more
than 10 days earlier. By changing the workflow, we can reduce the chance of
this kind of consensus problem happening.
> Consensus Problem when running the tool, MetadataCleanupJob
> -----------------------------------------------------------
>
> Key: KYLIN-1883
> URL: https://issues.apache.org/jira/browse/KYLIN-1883
> Project: Kylin
> Issue Type: Bug
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Attachments:
> better_solution_for_consensus_issue_of_MetadataCleanupJob.patch
>
>
> When doing the cleanup, the current strategy is as follows:
> 1. first, create a referenceSet
> 2. then add items not belonging to the referenceSet to the toDeleteSet
> 3. finally, delete the items in the toDeleteSet
> A consensus issue will occur, since we cannot be sure that none of the items
> in the toDeleteSet are referenced if the referenceSet changes during the
> process.
> For example, before the cleanup, SEGMENT_A is deleted and leaves behind a
> DICT_A created at the building step. The referenceSet will then not include
> DICT_A. After the reference set is created, SEGMENT_B starts to build. Since
> DICT_A still exists, it can still be referenced by SEGMENT_B. DICT_A will
> nevertheless be included in the toDeleteSet and be deleted later. In the end,
> SEGMENT_B only holds a reference with no data.
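
The scenario in the description can be reproduced with a small, single-threaded
Java sketch. This is not Kylin's MetadataCleanupJob code: the class name, the
in-memory store, and every method name below are hypothetical stand-ins chosen
only to make the ordering of the three steps and the concurrent segment build
explicit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the race; none of these names come from Kylin.
public class CleanupRaceSketch {
    // Dictionaries that physically exist in the (in-memory) store.
    static final Set<String> dictStore = new HashSet<>();
    // Segment name -> dictionaries that segment references.
    static final Map<String, Set<String>> segmentRefs = new HashMap<>();

    // Step 1: snapshot every dictionary referenced at this moment.
    static Set<String> buildReferenceSet() {
        Set<String> refs = new HashSet<>();
        for (Set<String> r : segmentRefs.values()) {
            refs.addAll(r);
        }
        return refs;
    }

    // Step 2: everything in the store that the snapshot does not mention.
    static Set<String> buildToDeleteSet(Set<String> referenceSet) {
        Set<String> toDelete = new HashSet<>(dictStore);
        toDelete.removeAll(referenceSet);
        return toDelete;
    }

    // Step 3: delete the collected items.
    static void delete(Set<String> toDelete) {
        dictStore.removeAll(toDelete);
    }

    // True if the segment references a dictionary that no longer exists.
    static boolean hasDanglingReference(String segment) {
        Set<String> refs = segmentRefs.getOrDefault(segment, Collections.emptySet());
        return !dictStore.containsAll(refs);
    }

    // Replays the interleaving from the issue description in a fixed order.
    static boolean runScenario() {
        dictStore.clear();
        segmentRefs.clear();

        // SEGMENT_A is already gone, but its DICT_A was left behind.
        dictStore.add("DICT_A");

        // Cleanup step 1: DICT_A is unreferenced, so it is not in the snapshot.
        Set<String> referenceSet = buildReferenceSet();

        // Meanwhile SEGMENT_B starts building and reuses the existing DICT_A.
        segmentRefs.put("SEGMENT_B", new HashSet<>(Collections.singleton("DICT_A")));

        // Cleanup steps 2 and 3 still work from the stale snapshot.
        delete(buildToDeleteSet(referenceSet));

        return hasDanglingReference("SEGMENT_B");
    }

    public static void main(String[] args) {
        System.out.println("SEGMENT_B dangling: " + runScenario()); // prints "SEGMENT_B dangling: true"
    }
}
```

In this sketch the stale snapshot taken in step 1 is what makes the deletion
unsafe; rebuilding the reference set immediately before step 3 would narrow
the window but not close it without some form of coordination with the build.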
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)