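Hi Kylin team:

Step: Redistribute intermediate table
#
By default, this step uses the first three dimension fields as the DISTRIBUTE BY keys instead of DISTRIBUTE BY RAND(). If those dimension fields are not suitable distribution keys, this default strategy makes the data even more skewed.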
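
To illustrate the concern, here is a rough Python simulation. It is not Kylin or Hive code: it assumes a simplified model in which each row goes to reducer hash(key) % number_of_reducers, and the reducer count, row count, and field cardinalities are made up for the example.

```python
import random

NUM_REDUCERS = 10  # assumed reducer count, for illustration only


def make_rows(n=10000, seed=1):
    """Hypothetical rows (Field1, Field2, Field3) with very few distinct values."""
    rng = random.Random(seed)
    # Field1 and Field2 take 2 values each, Field3 is constant:
    # at most 4 distinct key combinations in total.
    return [(rng.randrange(2), rng.randrange(2), 0) for _ in range(n)]


def loads_by_key(rows, num_reducers=NUM_REDUCERS):
    """Rows per reducer when the reducer is chosen by hashing the three fields."""
    loads = [0] * num_reducers
    for row in rows:
        loads[hash(row) % num_reducers] += 1
    return loads


def loads_by_rand(rows, num_reducers=NUM_REDUCERS, seed=0):
    """Rows per reducer when the reducer is chosen at random for each row."""
    rng = random.Random(seed)
    loads = [0] * num_reducers
    for _ in rows:
        loads[rng.randrange(num_reducers)] += 1
    return loads


if __name__ == "__main__":
    rows = make_rows()
    print("by fields:", loads_by_key(rows))
    print("by rand:  ", loads_by_rand(rows))
```

With only four distinct key combinations, the field-based distribution can fill at most four of the ten reducers, while the random distribution spreads the rows across all of them.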

Best Regards!

> On November 2, 2018, at 12:03 PM, liuzhixin <liuz...@163.com> wrote:
>
> Hi kylin team:
>
> Version: Kylin2.5-hadoop3.1 for hdp3.0
> #
> Step: Redistribute intermediate table
> #
> DISTRIBUTE BY is that:
> INSERT OVERWRITE TABLE table_intermediate SELECT * FROM table_intermediate
> DISTRIBUTE BY Field1, Field2, Field3;
> #
> Not DISTRIBUTE BY RAND()
> #
> Is this default DISTRIBUTE BY Field1, Field2, Field3? how to DISTRIBUTE BY
> RAND()?
>
> Best wishes.
>