[ https://issues.apache.org/jira/browse/KYLIN-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822638#comment-15822638 ]
Shaofeng SHI commented on KYLIN-2217:
-------------------------------------
Scanning the fact table twice is costly, and we should avoid it. I think the
dictionaries can be merged (on the job node) after they are built in the
reducers. The memory footprint of merging is much smaller than that of
building, so it is acceptable for the job node. Would this be better?
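
To make the proposal concrete, here is a minimal sketch of the merge step. It
assumes each reducer emits a sorted list of the distinct values it saw for its
slice of a column. The function names are hypothetical and are not Kylin's
API, a plain mapping stands in for the Trie-based dictionary, and Python is
used only for brevity.

```python
import heapq
from itertools import groupby


def build_partial_dictionary(values):
    """Reducer side (hypothetical): sorted distinct values of one slice."""
    return sorted(set(values))


def merge_dictionaries(partials):
    """Job-node side (hypothetical): k-way merge of sorted partial results.

    heapq.merge streams its inputs, so the merge itself only compares the
    current head of each partial list; no re-sort of all values is needed.
    """
    merged = heapq.merge(*partials)
    # Adjacent duplicates come from values seen by more than one reducer.
    distinct = (value for value, _ in groupby(merged))
    # Assign IDs in sorted order (an assumption of this sketch).
    return {value: idx for idx, value in enumerate(distinct)}


partials = [
    build_partial_dictionary(["beijing", "shanghai", "beijing"]),
    build_partial_dictionary(["shenzhen", "beijing"]),
]
dictionary = merge_dictionaries(partials)
```

Because every partial list is already sorted, the job node does a single
streaming pass over the reducer outputs, which is why merging should need far
less memory than building from raw values.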
> Reducers build dictionaries locally
> -----------------------------------
>
> Key: KYLIN-2217
> URL: https://issues.apache.org/jira/browse/KYLIN-2217
> Project: Kylin
> Issue Type: Improvement
> Affects Versions: v1.5.4.1
> Reporter: XIE FAN
> Assignee: XIE FAN
> Fix For: v2.0.0
>
> Attachments: 0001-KYLIN-2217-Reducers-build-dictionaries-locally.patch
>
>
> In KYLIN-1851, we reduced the peak memory usage of the dictionary-building
> procedure by splitting a single Trie tree structure into a Trie forest. But
> there still exists a bottleneck: all the dictionaries are built in the Kylin
> client. In this issue, we want to use multiple reducers to build different
> dictionaries locally and concurrently, which can further reduce the peak
> memory usage as well as speed up the dictionary-building procedure.