[ https://issues.apache.org/jira/browse/KYLIN-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836484#comment-17836484 ]
ASF subversion and git services commented on KYLIN-5828:
--------------------------------------------------------
Commit c61dc4189968f29f8e56e98f72a16f100e9d6e2b in kylin's branch
refs/heads/kylin5 from huangsheng
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c61dc41899 ]
KYLIN-5828 Concurrently dict v2 jobs lead to abnormal encoding result
> During multi-jobs concurrent building, the flat table may use inconsistent
> global dictionaries, resulting in incorrect count distinct query results.
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-5828
> URL: https://issues.apache.org/jira/browse/KYLIN-5828
> Project: Kylin
> Issue Type: Bug
> Components: Storage - Parquet
> Reporter: Zhimin Wu
> Assignee: Zhimin Wu
> Priority: Major
>
> *Root Cause*
> When multiple jobs build concurrently against the same global dictionary,
> nothing guarantees that flat table encoding uses a single, consistent
> dictionary version. If another job expands the dictionary while encoding is
> in progress, some flat table partitions mistakenly read the new version of
> the dictionary partition files. Because the data distribution differs
> between versions, those partitions cannot find the correct dictionary
> entries, so the encoded flat table column is 0 and the count distinct
> result is wrong.
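
The root cause above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not Kylin's actual code: the names
bucket_of, build_version and encode are hypothetical, integer values stand in
for hashed column values, and a "version" is modelled as a list of in-memory
bucket maps rather than dictionary partition files. Rows are distributed by
the bucket count of one version, and the lookup then reads bucket files from
either the same version or a newer, expanded one.

```python
def bucket_of(value: int, num_buckets: int) -> int:
    # Hypothetical stand-in for hashing a column value to a dictionary bucket.
    return value % num_buckets


def build_version(new_values, num_buckets, prior_ids=None):
    """Build one dictionary version as a list of bucket maps (value -> id).

    IDs already assigned in prior_ids are kept; new values get the next ID.
    ID 0 is never assigned, so it can signal "value not found".
    """
    ids = dict(prior_ids or {})
    for value in new_values:
        if value not in ids:
            ids[value] = len(ids) + 1
    buckets = [{} for _ in range(num_buckets)]
    for value, encoded in ids.items():
        buckets[bucket_of(value, num_buckets)][value] = encoded
    return buckets, ids


def encode(rows, versions, layout_version, read_version):
    """Encode rows that were distributed by layout_version's bucket count,
    while reading the bucket maps of read_version."""
    num_buckets = len(versions[layout_version])
    files = versions[read_version]
    encoded = []
    for value in rows:
        bucket = bucket_of(value, num_buckets)
        encoded.append(files[bucket].get(value, 0))  # 0 when the lookup misses
    return encoded


rows = [10, 11, 12, 13, 14, 15]
v1_buckets, v1_ids = build_version(rows, num_buckets=2)
# A concurrent job adds values and expands the dictionary to 3 buckets.
v2_buckets, _ = build_version([16, 17], num_buckets=3, prior_ids=v1_ids)
versions = {"v1": v1_buckets, "v2": v2_buckets}

consistent = encode(rows, versions, layout_version="v1", read_version="v1")
mixed = encode(rows, versions, layout_version="v1", read_version="v2")

print(consistent, len(set(consistent)))  # [1, 2, 3, 4, 5, 6] 6
print(mixed, len(set(mixed)))            # [0, 0, 3, 4, 0, 0] 3
```

In this model, reading a single version end to end yields six distinct
encoded values, while mixing versions collapses four of the six rows to 0, so
a count distinct over the encoded column is wrong.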
--
This message was sent by Atlassian Jira
(v8.20.10#820010)