[jira] [Created] (KYLIN-2800) All dictionaries should be built based on the flat hive table

zhengdong (JIRA) Mon, 21 Aug 2017 03:35:10 -0700

zhengdong created KYLIN-2800:
--------------------------------

             Summary: All dictionaries should be built based on the flat hive table
                 Key: KYLIN-2800
                 URL: https://issues.apache.org/jira/browse/KYLIN-2800
             Project: Kylin
          Issue Type: Bug
            Reporter: zhengdong



After KYLIN-2457, we still sometimes got wrong query results after a merge job
finished.
We eventually realized that the root cause is that we always use the lookup
table as the source data to build dictionaries for FK columns.
However, a lookup table that grows incrementally does not guarantee sequential,
increasing PK values.
If a new record is inserted into the lookup table and its PK value is not the
maximum, the dictionary IDs of all PK values larger than the new one can change
in the newly built dictionary.
What's more, using the lookup table as the source data for an FK column's
dictionary may have a performance advantage for merge jobs, but it may also run
into the too-big-dictionary problem for large lookup tables. We would also have
to add validation rules to ensure that PK values are sequential and increasing.
On the other hand, we could simply use the flat hive table as the data source
for all dictionaries.
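
To illustrate the ID shift described above, here is a minimal sketch. It is not
Kylin code: build_dictionary is a hypothetical helper, and it assumes that a
dictionary assigns consecutive integer IDs to distinct values in sorted order.
The PK values are made up for the example.

```python
def build_dictionary(values):
    """Assign consecutive integer IDs to distinct values in sorted order.

    Assumption for illustration only: an order-preserving dictionary,
    not Kylin's actual implementation.
    """
    return {v: i for i, v in enumerate(sorted(set(values)))}


# Lookup table PKs when the first segment was built (hypothetical values).
old_lookup_pks = [10, 20, 40]
# A new record with PK 30 is inserted later; 30 is not the max value.
new_lookup_pks = [10, 20, 30, 40]

old_dict = build_dictionary(old_lookup_pks)
new_dict = build_dictionary(new_lookup_pks)

# PK values whose ID differs between the two dictionaries: (old ID, new ID).
shifted = {
    pk: (old_dict[pk], new_dict[pk])
    for pk in old_dict
    if old_dict[pk] != new_dict[pk]
}
print(shifted)  # {40: (2, 3)}
```

In this sketch, data encoded with the old dictionary stores ID 2 for PK 40,
while the new dictionary maps ID 2 to PK 30, so mixing the two without
re-encoding would decode to the wrong value.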



