[ 
https://issues.apache.org/jira/browse/KYLIN-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-2866:
----------------------------------
    Description: 
Currently only one reducer is assigned to the HLL stats calculation, which may 
become a bottleneck and slow down this step. Since the stats for different 
cuboids do not influence each other, it's better to divide the cuboid set 
into several subsets and assign a reducer to each subset.
The strategy of this patch is to put 100 cuboids into each subset. There is 
also an upper limit on the number of reducers for the HLL stats calculation; 
currently it's 50.
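
A rough sketch of the sizing rule described above, for illustration only. It is not the patch's code: the function names are hypothetical, and assigning a cuboid to a reducer by index modulo is an assumption, since the description does not say how cuboids are assigned. Only the two constants (100 cuboids per subset, at most 50 reducers) come from the description.

```python
CUBOIDS_PER_REDUCER = 100  # subset size given in the description
MAX_HLL_REDUCERS = 50      # upper limit given in the description


def hll_reducer_count(num_cuboids: int) -> int:
    """One reducer per 100 cuboids (rounded up), capped at 50."""
    if num_cuboids <= 0:
        return 1  # assumption: always keep at least one reducer
    # Integer ceiling division avoids floating-point rounding.
    wanted = (num_cuboids + CUBOIDS_PER_REDUCER - 1) // CUBOIDS_PER_REDUCER
    return min(MAX_HLL_REDUCERS, wanted)


def hll_reducer_for(cuboid_index: int, num_cuboids: int) -> int:
    """Hypothetical assignment: spread cuboids round-robin by list position."""
    return cuboid_index % hll_reducer_count(num_cuboids)


if __name__ == "__main__":
    for n in (1, 100, 101, 250, 10000):
        print(n, "cuboids ->", hll_reducer_count(n), "reducer(s)")
```

Under this sketch, once the cap is reached (more than 5000 cuboids), each reducer simply handles more than 100 cuboids.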

> Enlarge the reducer number for hyperloglog statistics calculation at step 
> FactDistinctColumnsJob
> ------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-2866
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2866
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>
> Currently only one reducer is assigned to the HLL stats calculation, which may 
> become a bottleneck and slow down this step. Since the stats for different 
> cuboids do not influence each other, it's better to divide the cuboid set 
> into several subsets and assign a reducer to each subset.
> The strategy of this patch is to put 100 cuboids into each subset. There is 
> also an upper limit on the number of reducers for the HLL stats calculation; 
> currently it's 50.



