Hi Wang,
   "DISTRIBUTE BY RAND()" may cause data inconsistency, so we changed the
redistribute step to distribute by the first few columns of the rowkey; the
default is the first 3 columns. See the following issues for more details. If
you don't have a data skew problem, you can simply disable the "redistribute"
step by setting "kylin.source.hive.redistribute-flat-table" to false.
https://issues.apache.org/jira/browse/KYLIN-3388
https://issues.apache.org/jira/browse/KYLIN-3457
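
For reference, a minimal sketch of that setting. The property name and value
are the ones mentioned above; placing it in conf/kylin.properties is an
assumption about a standard install, so adjust for your deployment.

```properties
# Skip the "redistribute flat table" step during cube build.
# Assumption: set globally in conf/kylin.properties.
kylin.source.hive.redistribute-flat-table=false
```

With the step disabled, the intermediate flat table is no longer
redistributed at all, so this only helps if the table is not skewed to begin
with.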

On Thu, Jun 13, 2019 at 5:03 PM [email protected] <[email protected]>
wrote:

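> According to the documentation, the intermediate table is redistributed
> randomly, using DISTRIBUTE BY RAND(). My cube does not specify a shard-by
> column, yet the data is not distributed randomly; the first 3 dimension
> columns are used instead. Because the cube has no high-cardinality
> dimension, this causes data skew. How can I configure it to distribute
> randomly, or do you have any other suggestions?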
>
> ------------------------------
> [email protected]
>
