[ 
https://issues.apache.org/jira/browse/KYLIN-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822638#comment-15822638
 ] 

Shaofeng SHI commented on KYLIN-2217:
-------------------------------------

Scanning the fact table twice is costly and should be avoided. I think the 
dictionaries can be merged (on the job node) after being built in the reducers. 
The memory footprint of merging is much smaller than that of building, so it is 
acceptable for the job node. Would this be better?
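
To illustrate the merge idea, here is a rough sketch in Python (not Kylin 
code). It assumes each reducer emits its partial dictionary as a sorted list of 
distinct values; the function names are hypothetical. The job node then only 
needs one cursor per partial dictionary while merging, instead of holding all 
raw values as the build step does.

```python
import heapq


def build_partial_dictionary(values):
    """Reducer side (hypothetical): sorted distinct values seen by one reducer."""
    return sorted(set(values))


def merge_dictionaries(partials):
    """Job-node side (hypothetical): stream a k-way merge of sorted partial
    dictionaries, drop duplicates, and assign dense IDs in sorted order."""
    next_id = 0
    previous = None
    for value in heapq.merge(*partials):
        # Inputs are sorted, so equal values arrive next to each other.
        if next_id == 0 or value != previous:
            yield value, next_id
            next_id += 1
            previous = value


if __name__ == "__main__":
    partials = [
        build_partial_dictionary(["b", "a", "b"]),
        build_partial_dictionary(["c", "a"]),
    ]
    for value, value_id in merge_dictionaries(partials):
        print(value, value_id)
```

Because the merge is a generator, the merged entries can be written out as 
they are produced rather than collected in memory first.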

> Reducers build dictionaries locally
> -----------------------------------
>
>                 Key: KYLIN-2217
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2217
>             Project: Kylin
>          Issue Type: Improvement
>    Affects Versions: v1.5.4.1
>            Reporter: XIE FAN
>            Assignee: XIE FAN
>             Fix For: v2.0.0
>
>         Attachments: 0001-KYLIN-2217-Reducers-build-dictionaries-locally.patch
>
>
> In KYLIN-1851, we reduced the peak memory usage of the dictionary-building 
> procedure by splitting a single Trie structure into a Trie forest. But there 
> still exists a bottleneck: all the dictionaries are built in the Kylin client. 
> In this issue, we want to use multiple reducers to build different dictionaries 
> locally and concurrently, which can further reduce the peak memory usage as 
> well as speed up the dictionary-building procedure.
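
As a rough illustration of the proposal quoted above (a Python sketch, not the 
attached patch), the snippet below simulates routing each column's values to 
one reducer so that every reducer builds only its own columns' dictionaries. 
The modulo routing rule and all function names are assumptions made for the 
example.

```python
from collections import defaultdict


def assign_reducer(column_index, num_reducers):
    """Hypothetical routing rule: one column always goes to the same reducer."""
    return column_index % num_reducers


def run_reducers(rows, columns, num_reducers):
    """Simulate the shuffle and the per-reducer dictionary build in one process."""
    # Shuffle phase: collect distinct values per (reducer, column).
    shuffled = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for index, column in enumerate(columns):
            reducer_id = assign_reducer(index, num_reducers)
            shuffled[reducer_id][column].add(row[index])

    # Reduce phase: each reducer encodes only the columns routed to it.
    result = {}
    for reducer_id, per_column in shuffled.items():
        result[reducer_id] = {
            column: {value: i for i, value in enumerate(sorted(values))}
            for column, values in per_column.items()
        }
    return result


if __name__ == "__main__":
    rows = [("cn", "a"), ("us", "b"), ("cn", "b")]
    print(run_reducers(rows, ["country", "item"], num_reducers=2))
```

In a real job the reducers would run on separate nodes, so no single process 
would hold the values of every column at once.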



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
