kangkaisen created KYLIN-2764:
---------------------------------
Summary: Build the dict for UHC column with MR
Key: KYLIN-2764
URL: https://issues.apache.org/jira/browse/KYLIN-2764
Project: Kylin
Issue Type: Improvement
Components: Job Engine
Affects Versions: v2.0.0
Reporter: kangkaisen
Assignee: kangkaisen
KYLIN-2217 builds the dict for normal columns with MR, but the dict for UHC
(ultra high cardinality) columns is still built in the JobServer. As in
KYLIN-2217, we could also use MR to build the dict for UHC columns. This would
relieve the memory pressure on the JobServer, improve its job concurrency, and
speed up the build when a cube has multiple UHC columns.
The MR input is the output of the "Extract Fact Table Distinct Columns" step,
and the MR output is the UHC column dict. Because it is very hard to build a
global dict with multiple reducers, I use one reducer per UHC column and
allocate enough memory to that reducer. According to my test, 8G of memory is
enough.
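
To illustrate the "one reducer per UHC column" idea, here is a minimal,
self-contained sketch in Python that simulates the shuffle and reduce phases in
memory. It is not the Kylin implementation: the names partition and
build_dicts, the (column, value) record shape, and the assignment of ids in
sorted order are all assumptions made for illustration only.

```python
from collections import defaultdict


def partition(column_index: int, num_reducers: int) -> int:
    """Route every value of one UHC column to the same reducer.

    Hypothetical helper: assumes there are at least as many reducers
    as UHC columns, so column i is handled by reducer i alone.
    """
    if column_index < 0 or column_index >= num_reducers:
        raise ValueError("need one reducer per UHC column")
    return column_index


def build_dicts(records, uhc_columns):
    """Simulate the MR job: records are (column_name, value) pairs,
    standing in for the output of the distinct-columns step."""
    num_reducers = len(uhc_columns)
    index_of = {name: i for i, name in enumerate(uhc_columns)}

    # "Shuffle": group distinct values by the reducer chosen for their column.
    reducer_inputs = defaultdict(set)
    for column, value in records:
        if column in index_of:
            reducer_inputs[partition(index_of[column], num_reducers)].add(value)

    # "Reduce": each reducer sees all values of its column, so it can
    # assign ids without coordinating with any other reducer.
    # Sorted-order ids are an assumption of this sketch.
    dicts = {}
    for i, name in enumerate(uhc_columns):
        values = sorted(reducer_inputs[i])
        dicts[name] = {v: idx for idx, v in enumerate(values)}
    return dicts
```

Because a single reducer holds every distinct value of its column, its memory
footprint grows with the column's cardinality, which is why the reducer needs
a generous memory allocation.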