[ https://issues.apache.org/jira/browse/TEZ-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025783#comment-16025783 ]
Siddharth Seth commented on TEZ-3739:
-------------------------------------
I think we should add a few more tests here (maybe in a follow-up JIRA).
- Tasks generating a large number of records, which would push a normal case
way past the target maxParallelism.
- An equal number of records from each source, validating that each source
gets an equal weight.
- Parallelism = maxParallelism (instead of only getting close to maxParallelism).
> Fair CartesianProduct doesn't works well with huge difference in output size
> ----------------------------------------------------------------------------
>
> Key: TEZ-3739
> URL: https://issues.apache.org/jira/browse/TEZ-3739
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: TEZ-3739.1.patch
>
>
> Specifically, the weighted factorization of the initial parallelism breaks
> down if the number of records on each side differs too much. The formula
> works over real numbers, but not over integers.
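
To illustrate the kind of failure described above, here is a minimal sketch. It is
not the Tez implementation: it assumes two sources and a split in which the
per-source chunk counts c1 and c2 satisfy c1 * c2 = parallelism and
c1 / c2 = n1 / n2, where n1 and n2 are record counts. The class and method names
are hypothetical.

```java
public class WeightedFactorizationSketch {

    // Real-valued split (assumed formula, not taken from Tez):
    //   c1 * c2 = parallelism  and  c1 / c2 = n1 / n2
    static double[] realFactors(long n1, long n2, int parallelism) {
        double c1 = Math.sqrt(parallelism * ((double) n1 / n2));
        double c2 = Math.sqrt(parallelism * ((double) n2 / n1));
        return new double[] {c1, c2};
    }

    // Naive integer version: truncate each real-valued factor.
    static int[] truncatedFactors(long n1, long n2, int parallelism) {
        double[] real = realFactors(n1, n2, parallelism);
        return new int[] {(int) real[0], (int) real[1]};
    }

    public static void main(String[] args) {
        // Balanced inputs: the integer factors still multiply to the target.
        int[] balanced = truncatedFactors(1000, 1000, 100);
        System.out.println(balanced[0] + " x " + balanced[1]);

        // Heavily skewed inputs: one factor truncates to zero while the
        // other grows far beyond the target parallelism.
        int[] skewed = truncatedFactors(1_000_000, 1, 100);
        System.out.println(skewed[0] + " x " + skewed[1]);
    }
}
```

Under these assumptions, the balanced case yields 10 x 10, while the skewed case
yields 10000 x 0: the real-valued product is still 100, but neither integer
factor is usable without clamping and rebalancing.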