In most cases, one reducer is enough to merge the distinct values from all dimension columns of the fact table. But if there are multiple ultra-high-cardinality columns, using multiple reducers would give better concurrency. This is actually the task I'm working on today, as part of the work for another feature; it will be rolled out in a release after 2.0. By the way, please try to use English to reach a wider audience.
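To illustrate why more reducers help here, below is a rough sketch in plain Python (a simulation, not Kylin's actual code). It assumes a hypothetical routing rule where all values of one column go to reducer `column_index % num_reducers`; the names `partition`, `distinct_columns` and `load` and the sample rows are made up for the example.

```python
from collections import defaultdict


def partition(column_index, num_reducers):
    # Hypothetical rule: every value of a given column goes to the same reducer.
    return column_index % num_reducers


def distinct_columns(rows, num_reducers):
    # One dict per simulated reducer: column index -> set of distinct values.
    reducers = [defaultdict(set) for _ in range(num_reducers)]
    for row in rows:
        for col, value in enumerate(row):
            # "Map" emits (col, value); the "shuffle" picks the reducer.
            reducers[partition(col, num_reducers)][col].add(value)
    return reducers


def load(reducers):
    # Number of distinct values each reducer has to hold in memory.
    return [sum(len(values) for values in r.values()) for r in reducers]


rows = [
    ("u1", "CN", "2016-01-30"),
    ("u2", "CN", "2016-01-30"),
    ("u3", "US", "2016-01-31"),
]

single = distinct_columns(rows, 1)
multi = distinct_columns(rows, 3)

print(load(single))  # [7]
print(load(multi))   # [3, 2, 2]
```

With one reducer, a single node holds the distinct values of every column; with one reducer per column, the high-cardinality column (the first one here) no longer shares memory with the others.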
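On Sat, Jan 30, 2016 at 11:02 PM -0800, "热爱大发挥" <[email protected]> wrote: My fact table has about 50 million rows. In the second step of the cube build (Extract Fact Table Distinct Columns), why is the number of reducers always 1? I checked the source code and it is indeed hard-coded to 1. This puts too much load on a single node, and the job exited because it ran out of memory. How can this be solved? Is it possible to customize the MapReduce parameters for each step?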
