[ 
https://issues.apache.org/jira/browse/TEZ-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026633#comment-16026633
 ] 

Zhiyuan Yang commented on TEZ-3739:
-----------------------------------

Thanks [~sseth] for the review!

bq. Tasks generating a large number of records, which would send a normal case 
way past the target maxParallelism.
This case is tested in testDAGVertexOnlyGroupByMaxParallelism.

bq. Equal number of records from each source, and validate an equal weight to 
each of them
This case is tested in testDAGVertexOnlyGroupByMinOpsPerWorker (the record 
counts are not exactly equal, but similar).

bq. Parallelism = MaxParallelism (instead of getting close to maxParallelism)
This is similar to the case of too many records, where we cap parallelism at 
maxParallelism.

> Fair CartesianProduct doesn't works well with huge difference in output size
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-3739
>                 URL: https://issues.apache.org/jira/browse/TEZ-3739
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: TEZ-3739.1.patch
>
>
> Specifically, the weighted factorization of initial parallelism breaks down 
> if the number of records on each side differs too much. The formula works 
> over the real numbers, but not over the integers.
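
To illustrate the real-versus-integer problem described above, here is a minimal sketch in Python. It is not the formula used in Tez or in TEZ-3739.1.patch. It assumes a simple weighted factorization in which the two per-side parallelisms multiply to a target total and are proportional to each side's record count. All function and parameter names are hypothetical.

```python
import math


def real_split(total, n_left, n_right):
    """Real-valued split (assumed model, not the Tez formula).

    Picks p_left and p_right so that p_left * p_right == total and
    p_left / p_right == n_left / n_right.
    """
    p_left = math.sqrt(total * n_left / n_right)
    p_right = math.sqrt(total * n_right / n_left)
    return p_left, p_right


def naive_int_split(total, n_left, n_right):
    """Rounds the real-valued split; can collapse one side to 0."""
    p_left, p_right = real_split(total, n_left, n_right)
    return round(p_left), round(p_right)


def clamped_int_split(total, n_left, n_right, max_parallelism):
    """One possible guard: keep each side >= 1 and cap the product."""
    p_left, p_right = real_split(total, n_left, n_right)
    right = max(1, round(p_right))
    # Cap the larger side so that left * right <= max_parallelism.
    left = max(1, min(round(p_left), max_parallelism // right))
    return left, right


# Balanced inputs behave as expected in both variants.
print(naive_int_split(100, 5, 5))
# Heavily skewed inputs: the real-valued answer is about (10000, 0.01),
# which multiplies to 100, but rounding turns the small side into 0.
print(naive_int_split(100, 1_000_000, 1))
print(clamped_int_split(100, 1_000_000, 1, max_parallelism=100))
```

In this sketch the skewed case yields a product of zero after rounding, even though the real-valued factors multiply to the target. Clamping to at least 1 and capping at a maximum is only one way to keep the result usable.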



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
