Thanks. It looks like that job only ran once; when I rebuilt the cube, the job didn't run, and cube processing is fast now.

Sent from my iPhone

> On Feb 2, 2016, at 6:02 PM, ShaoFeng Shi <[email protected]> wrote:
>
> KYLIN-1066 <https://issues.apache.org/jira/browse/KYLIN-1066> is irrelevant
> to your issue; it was an intermediate issue during the development of v2.0,
> as you can see from its "Affected Version" and "Fixed Version", which are
> both "v2.0".
>
> The "Kylin Hive Column Cardinality Job" uses 1 reducer to merge the
> HyperLogLog counters from the mappers, to make a rough estimate of the
> column cardinality. As the output from each mapper is a list of HLL
> objects, instead of the full distinct values, the data size is small
> (1KB * # columns), so using 1 reducer to merge all the output should be
> more efficient.
>
> Besides, this job is not a step in cube building, and is invisible from the
> UI so far. Are you sure it is the slow one that you observed?
>
>
> 2016-02-03 8:11 GMT+08:00 greg gu <[email protected]>:
>
>> By the way, the job step that uses 1 reducer is "Kylin Hive Column
>> Cardinality Job". Is this expected?
>>
>>> From: [email protected]
>>> To: [email protected]
>>> Subject: only one reducer in job
>>> Date: Tue, 2 Feb 2016 11:31:37 -0800
>>>
>>> When I process the cube, I found there is only one reducer, which causes
>>> the job to run for a very long time.
>>> I found this https://issues.apache.org/jira/browse/KYLIN-1066, it
>>> mentioned the issue is fixed.
>>>
>>> Is there a way to change the number of reducers?
>>>
>>> Thanks,
>
> --
> Best regards,
>
> Shaofeng Shi
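
The single-reducer merge described in the quoted reply can be illustrated with a small sketch. This is not Kylin's code: the register count (2**10 one-byte registers, chosen to match the roughly 1 KB per column figure), the SHA-1 based hash, and the function names are all assumptions made for illustration. It shows why the reducer's input stays small: each mapper ships a fixed-size array of registers, and merging is an element-wise maximum.

```python
import hashlib
import math

P = 10          # assumed precision: 2**10 = 1024 one-byte registers (~1 KB)
M = 1 << P


def _hash64(value):
    """Hypothetical 64-bit hash: the first 8 bytes of SHA-1."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def new_counter():
    return bytearray(M)


def add(counter, value):
    h = _hash64(value)
    idx = h >> (64 - P)                       # top P bits pick a register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 54 bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the leftmost 1-bit
    if rank > counter[idx]:
        counter[idx] = rank


def merge(a, b):
    """What a single reducer would do: element-wise max of the registers."""
    return bytearray(max(x, y) for x, y in zip(a, b))


def estimate(counter):
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in counter)
    zeros = counter.count(0)
    if raw <= 2.5 * M and zeros:              # small-range correction
        return M * math.log(M / zeros)
    return raw


# Two "mappers" see overlapping slices of one column.
mapper_a, mapper_b, whole = new_counter(), new_counter(), new_counter()
for v in range(0, 10000):
    add(mapper_a, v)
for v in range(5000, 15000):
    add(mapper_b, v)
for v in range(0, 15000):
    add(whole, v)

merged = merge(mapper_a, mapper_b)
print(len(merged), "bytes per column, estimate:", round(estimate(merged)))
```

The merged registers are identical to those of a counter that saw every value directly, so the reducer only ever handles one fixed-size array per column per mapper, regardless of how many rows each mapper scanned.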
