Reducing "kylin.job.mapreduce.default.reduce.input.mb" will give you more reducers and can speed up the MR job, provided the bottleneck is in the reducers and there are spare reducer slots in your cluster.
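
To make the effect concrete, here is a minimal sketch of the arithmetic. It assumes the reducer count is roughly the estimated reduce input size divided by this setting, with a floor of one and an upper cap. The function name, the 2000 MB input size and the cap of 500 are illustrative assumptions, not Kylin's actual code or defaults.

```python
import math


def estimate_reducers(total_input_mb, per_reducer_mb, max_reducers=500):
    """Hypothetical estimate: one reducer per `per_reducer_mb` of input.

    Not Kylin's real formula; the cap `max_reducers` is an assumed value.
    """
    if per_reducer_mb <= 0:
        raise ValueError("per_reducer_mb must be positive")
    wanted = math.ceil(total_input_mb / per_reducer_mb)
    # Always at least one reducer, never more than the assumed cap.
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    # Assumed 2000 MB of reduce input; a smaller per-reducer size gives more reducers.
    for per_reducer_mb in (500, 250, 100):
        print(per_reducer_mb, estimate_reducers(2000, per_reducer_mb))
```

With a small input (a few million rows), the estimate may already be a single reducer, so lowering the value only helps up to the number of free reducer slots.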
However, there are many other reasons why an MR job can be slow, e.g. data skew, where a certain mapper or reducer gets an extremely big chunk of data and slows down the whole job. Based on experience, it's not common for a count distinct measure to be the main reason for a slow job.

On Thu, Jan 21, 2016 at 5:36 PM, 杨海乐 <[email protected]> wrote:

> I find that the reason is the precision of the count distinct measure. The
> precision is 1.2%. So the steps are too slow even though the data is
> small (million). Can I solve the problem by reducing the value of
> kylin.job.mapreduce.default.reduce.input.mb?
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/From-the-Build-Base-Cuboid-Data-step-to-Build-N-Dimension-steps-Too-much-time-is-taken-tp3351p3368.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
