[ 
https://issues.apache.org/jira/browse/KYLIN-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836484#comment-17836484
 ] 

ASF subversion and git services commented on KYLIN-5828:
--------------------------------------------------------

Commit c61dc4189968f29f8e56e98f72a16f100e9d6e2b in kylin's branch 
refs/heads/kylin5 from huangsheng
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c61dc41899 ]

KYLIN-5828 Concurrently dict v2 jobs lead to abnormal encoding result


> During multi-jobs concurrent building, the flat table may use inconsistent 
> global dictionaries, resulting in incorrect count distinct query results.
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5828
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5828
>             Project: Kylin
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Zhimin Wu
>            Assignee: Zhimin Wu
>            Priority: Major
>
> *Root Cause*
> When multiple jobs build concurrently against the same global dictionary,
> nothing guarantees that the flat table encoding process uses a single,
> consistent dictionary version. If another job expands the dictionary while
> encoding is in progress, some flat table partitions mistakenly read the new
> version of the dictionary partition files. Because the data distribution is
> then inconsistent between the flat table partitions and the dictionary
> partition files, the lookup cannot find the correct dictionary content, the
> flat table encoding column is set to 0, and count distinct ultimately
> returns an abnormal value.
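
The failure mode described in the root cause can be illustrated with a small, self-contained sketch. This is not Kylin code: the functions build_version and encode, the modulo-based bucketing, and the id scheme (value + 1, with 0 meaning "not found") are all assumptions made for illustration. It only shows how rows distributed for one dictionary version can miss their entries when the partition files of an expanded version are read instead.

```python
# Toy model of a bucketed global dictionary (hypothetical, not Kylin's implementation).
# A dictionary "version" is a list of bucket files; each bucket maps value -> id.
# Id 0 is reserved here to mean "value not found in the bucket that was read".

def build_version(values, bucket_count):
    """Distribute values into bucket_count buckets by value % bucket_count."""
    buckets = [{} for _ in range(bucket_count)]
    for value in values:
        buckets[value % bucket_count][value] = value + 1
    return buckets

def encode(values, shuffle_version, read_version):
    """Encode values whose rows were distributed using shuffle_version's
    bucket count, while looking ids up in read_version's bucket files."""
    bucket_count = len(shuffle_version)
    encoded = {}
    for value in values:
        index = value % bucket_count          # flat table partition of this row
        bucket_file = read_version[index]     # dictionary file read by that partition
        encoded[value] = bucket_file.get(value, 0)
    return encoded

v1 = build_version(range(8), 2)    # version seen when this job started
v2 = build_version(range(12), 4)   # version after a concurrent job expanded the dictionary

job_values = list(range(8))
pinned = encode(job_values, v1, v1)  # same version for distribution and lookup
mixed = encode(job_values, v1, v2)   # distributed for v1, but reads v2 files

print(len(set(pinned.values())))  # 8 distinct encoded ids
print(len(set(mixed.values())))   # 5 distinct encoded ids, several rows encoded as 0
```

In this sketch, pinning one dictionary version for the whole encoding step keeps every value resolvable, while mixing versions collapses several values to 0 and lowers the distinct count of the encoded column.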



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
