[ https://issues.apache.org/jira/browse/TEZ-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126590#comment-16126590 ]
Gopal V commented on TEZ-3819:
------------------------------
bq. although skew caused by repeated key becomes inevitable.
Is this using the HashPartitioner on the key? The key will be a 0-byte
structure, because the cross-product is used only when the key is an empty list.
For some reason, I assumed that the HashPartitioner would be applied to the
serialized value bytes, which would work around the repeatability problem.
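
To make the distinction concrete, here is a minimal Java sketch. It is not Tez
code: the names ValuePartitionSketch, hashPartition, roundRobinAssign and
hashAssign are hypothetical, strings stand in for serialized values, and it
assumes that a retried task may see the same records in a different order.
Under those assumptions, hashing an empty key always yields the same partition,
round robin depends on arrival position, and hashing the value bytes depends
only on the record itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Tez.
final class ValuePartitionSketch {

    // Partition derived purely from the given bytes (non-negative hash mod n).
    static int hashPartition(byte[] bytes, int numPartitions) {
        return (Arrays.hashCode(bytes) & Integer.MAX_VALUE) % numPartitions;
    }

    // Round robin: the partition depends on the record's position in the stream.
    // Assumes the values are distinct so they can serve as map keys.
    static Map<String, Integer> roundRobinAssign(List<String> values, int numPartitions) {
        Map<String, Integer> assignment = new LinkedHashMap<>();
        int next = 0;
        for (String value : values) {
            assignment.put(value, next);
            next = (next + 1) % numPartitions;
        }
        return assignment;
    }

    // Hash on the serialized value bytes: the partition depends only on the record.
    static Map<String, Integer> hashAssign(List<String> values, int numPartitions) {
        Map<String, Integer> assignment = new LinkedHashMap<>();
        for (String value : values) {
            byte[] serialized = value.getBytes(StandardCharsets.UTF_8);
            assignment.put(value, hashPartition(serialized, numPartitions));
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Same records, different arrival order (assumed retry behaviour).
        List<String> firstAttempt = Arrays.asList("a", "b", "c");
        List<String> retry = Arrays.asList("c", "a", "b");

        // An empty (0-byte) key hashes to one fixed partition for every record.
        System.out.println("empty key -> partition " + hashPartition(new byte[0], 4));

        System.out.println("round robin stable across retry: "
                + roundRobinAssign(firstAttempt, 2).equals(roundRobinAssign(retry, 2)));
        System.out.println("value hash stable across retry:  "
                + hashAssign(firstAttempt, 2).equals(hashAssign(retry, 2)));
    }
}
```

The value-hash variant is repeatable regardless of order, but identical values
still land in the same partition, so it does not remove skew from repeated
values.
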
> Round robin partitioner make fair cartesian product not fault tolerant
> ----------------------------------------------------------------------
>
> Key: TEZ-3819
> URL: https://issues.apache.org/jira/browse/TEZ-3819
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: TEZ-3819.patch
>
>
> In case of task failure and retry, round robin partitioner cannot give
> consistent partitioning, which can cause wrong output of fair cartesian
> product.