[
https://issues.apache.org/jira/browse/TEZ-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224123#comment-16224123
]
Zhiyuan Yang commented on TEZ-3819:
-----------------------------------
[~gopalv] Even a value-based hash partitioner cannot handle skew from
repeated values. There is a very tricky way to work around this (in Hive):
select one more irrelevant but unskewed column, and the skew will go away.
More work, less time.
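
A minimal sketch of the workaround, not Tez or Hive code: `partition_of`,
`NUM_PARTITIONS`, and the sample rows are hypothetical, and CRC32 stands in
for whatever hash a value-based partitioner would actually use. It only
illustrates why hashing a repeated value sends every row to one partition,
and why an extra unskewed column spreads the rows out.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_of(record):
    """Hash the whole record value (illustrative stand-in for a value-based partitioner)."""
    data = "|".join(str(field) for field in record).encode("utf-8")
    return zlib.crc32(data) % NUM_PARTITIONS


# 1000 rows whose only selected column repeats the same value.
skewed_rows = [("hot",) for _ in range(1000)]
skewed_load = Counter(partition_of(r) for r in skewed_rows)

# The same rows with one extra, unskewed column (here a row id).
widened_rows = [("hot", i) for i in range(1000)]
widened_load = Counter(partition_of(r) for r in widened_rows)

print("repeated value only:", dict(skewed_load))
print("with unskewed column:", dict(widened_load))
```

Identical values always hash to the same partition, so the first counter has
a single entry holding all rows, while the second is spread over several
partitions at the cost of shipping one more column.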
> Round robin partitioner make fair cartesian product not fault tolerant
> ----------------------------------------------------------------------
>
> Key: TEZ-3819
> URL: https://issues.apache.org/jira/browse/TEZ-3819
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: TEZ-3819.patch
>
>
> In case of task failure and retry, round robin partitioner cannot give
> consistent partitioning, which can cause wrong output of fair cartesian
> product.
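
The description above can be illustrated with a small sketch. This is not
the Tez partitioner code: `round_robin`, `hash_based`, and the record lists
are hypothetical, and the sketch assumes that a retried task may see the same
records in a different arrival order than the failed attempt.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count


def round_robin(records):
    """Assign partitions by arrival position only."""
    return {rec: i % NUM_PARTITIONS for i, rec in enumerate(records)}


def hash_based(records):
    """Assign partitions by record value only (CRC32 as a stand-in hash)."""
    return {rec: zlib.crc32(rec.encode("utf-8")) % NUM_PARTITIONS for rec in records}


first_attempt = ["a", "b", "c", "d"]
retry_attempt = ["b", "a", "d", "c"]  # same records, different order (assumed)

rr_consistent = round_robin(first_attempt) == round_robin(retry_attempt)
hash_consistent = hash_based(first_attempt) == hash_based(retry_attempt)

print("round robin consistent across attempts:", rr_consistent)
print("hash-based consistent across attempts:", hash_consistent)
```

Under that assumption, round robin places the same record in different
partitions across attempts, so a consumer that mixes partitions from both
attempts can see records twice or not at all. A value-based assignment is
stable across attempts, which is the trade-off against skew discussed in the
comment.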
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)