Thanks,

It looks like that job only runs once: when I rebuilt the cube, the job
didn't run, and cube processing is fast now.

> On Feb 2, 2016, at 6:02 PM, ShaoFeng Shi <[email protected]> wrote:
> 
> KYLIN-1066 <https://issues.apache.org/jira/browse/KYLIN-1066> is unrelated
> to your issue. It was an intermediate issue during v2.0 development; you can
> see that its "Affected Version" and "Fixed Version" are both "v2.0".
> 
> The "Kylin Hive Column Cardinality Job" uses 1 reducer to merge the
> HyperLogLog counters from the mappers and produce a rough estimate of the
> column cardinality. Since the output of each mapper is a list of HLL
> objects rather than the full set of distinct values, the data size is small
> (1KB * # columns), so using 1 reducer to merge all the output should be more
> efficient.
> 
> Besides, this job is not a step in cube building and is not visible in the
> UI so far. Are you sure it is the slow one that you observed?
> 
> 
> 2016-02-03 8:11 GMT+08:00 greg gu <[email protected]>:
> 
>> By the way, the job step that uses 1 reducer is "Kylin Hive Column
>> Cardinality Job". Is this expected?
>> 
>>> From: [email protected]
>>> To: [email protected]
>>> Subject: only one reducer in job
>>> Date: Tue, 2 Feb 2016 11:31:37 -0800
>>> 
>>> When I process the cube, I found there is only one reducer, which causes
>>> the job to run for a very long time.
>>> I found https://issues.apache.org/jira/browse/KYLIN-1066, which says
>>> the issue is fixed.
>>> 
>>> Is there a way to change the number of reducers?
>>> 
>>> Thanks,
> 
> 
> 
> -- 
> Best regards,
> 
> Shaofeng Shi
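
For readers of the archive, the single-reducer merge described in the quoted
reply can be illustrated with a toy HyperLogLog. This is only a sketch under
stated assumptions, not Kylin's implementation: the precision P, the SHA-1
based hash, and the names new_sketch, add, merge and estimate are all
hypothetical choices made for the example. P = 10 is picked so that one sketch
is about 1 KB, in line with the "1KB * # columns" figure above.

```python
import hashlib
import math

P = 10          # assumed precision: 2**10 one-byte registers, about 1 KB per column
M = 1 << P


def new_sketch():
    """An empty HLL sketch: M registers, all zero."""
    return bytearray(M)


def add(sketch, value):
    """Record one value in the sketch (what each mapper does per row)."""
    h = int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")
    idx = h >> (64 - P)                       # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the first 1-bit
    if rank > sketch[idx]:
        sketch[idx] = rank


def merge(sketches):
    """Merge sketches by taking the register-wise maximum (the reducer's job)."""
    merged = new_sketch()
    for s in sketches:
        for i, r in enumerate(s):
            if r > merged[i]:
                merged[i] = r
    return merged


def estimate(sketch):
    """Rough cardinality estimate, with the usual small-range correction."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)
    return raw


# Three hypothetical mappers, each seeing an overlapping slice of one column.
slices = [range(0, 40000), range(20000, 60000), range(50000, 100000)]
mapper_outputs = []
for rows in slices:
    s = new_sketch()
    for v in rows:
        add(s, v)
    mapper_outputs.append(s)

# A single reducer only has to combine three 1 KB sketches.
merged = merge(mapper_outputs)
print(len(merged), "bytes merged, estimate ~", round(estimate(merged)))
```

The reducer's input here is three fixed-size sketches no matter how many rows
each mapper scanned, which is why a single reducer is normally enough for this
step. Because merging is a register-wise maximum, the merged sketch is
identical to one built from all the rows directly.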
