[jira] [Updated] (KYLIN-2800) All dictionaries should be built based on the flat hive table

zhengdong (JIRA) Mon, 21 Aug 2017 03:41:19 -0700

     [ 
https://issues.apache.org/jira/browse/KYLIN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


zhengdong updated KYLIN-2800:
-----------------------------
    Attachment: 0001-KYLIN-2800-All-dictionaries-should-be-built-based-on.patch

> All dictionaries should be built based on the flat hive table
> -------------------------------------------------------------
>
>                 Key: KYLIN-2800
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2800
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: zhengdong
>         Attachments: 
> 0001-KYLIN-2800-All-dictionaries-should-be-built-based-on.patch
>
>
> After KYLIN-2457, we still sometimes got wrong query results after a merge 
> job finished. 
> We eventually realized the root cause: we always use the lookup table as the 
> source data when building dictionaries for FK columns. 
> However, an incrementally growing lookup table does not guarantee sequential, 
> increasing PK values. 
> If a new record is inserted into the lookup table and its PK is not the 
> maximum value, the dictionary IDs of all PK values larger than the new one 
> can change in the newly built dictionary.
> What's more, using the lookup table as the source for an FK column's 
> dictionary may have a performance advantage for merge jobs, but it can also 
> produce an overly large dictionary for large lookup tables. We would also 
> have to add validation rules to ensure that PK values are sequential and 
> increasing.
> Alternatively, we could simply use the flat hive table as the data source 
> for all dictionaries.
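
The ID shift described above can be illustrated with a small sketch. It is
not Kylin code: it assumes a dictionary that assigns integer IDs by the sorted
order of the distinct values, and the helper name build_dictionary and the
sample PK values are hypothetical.

```python
def build_dictionary(values):
    """Hypothetical stand-in for a dictionary builder.

    Assumption: IDs are assigned by the sorted order of distinct values.
    """
    return {value: idx for idx, value in enumerate(sorted(set(values)))}


# Lookup table PKs at the time the earlier segment was built (sample data).
old_lookup_pks = [10, 20, 40]
# A new record with PK 30 is inserted later; 30 is not the maximum PK.
new_lookup_pks = [10, 20, 30, 40]

old_dict = build_dictionary(old_lookup_pks)
new_dict = build_dictionary(new_lookup_pks)

# Collect PKs whose ID differs between the two dictionaries: (old ID, new ID).
shifted = {
    pk: (old_dict[pk], new_dict[pk])
    for pk in old_dict
    if old_dict[pk] != new_dict[pk]
}

print(shifted)  # {40: (2, 3)}
```

Under that assumption, every PK larger than the inserted one gets a different
ID, so values encoded with the old dictionary no longer decode to the same PK
under the new one.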



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
