[ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153493#comment-15153493
 ] 

Bikas Saha commented on TEZ-3126:
---------------------------------

I am sorry, I may not have understood your comments fully. Are you suggesting 
that we round up the number of partitions to merge per task and change the 
number of tasks instead?

Wouldn't that break the max data per reducer limit? Ignoring the min data hint 
may be fine, but ignoring the max data limit could result in failures because 
it may break operator assumptions (e.g. the size of a hash table). Say the 
reducer was designed to handle 1 GB of data and we send it 1.7 GB instead.
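
To make the concern concrete, here is a rough sketch of the arithmetic. It is 
not taken from ShuffleVertexManager: the class and method names are 
hypothetical, the 1 GB limit and the total input size are assumed values, and 
partitions are assumed to be uniformly sized.

```java
// Hypothetical sketch; names and numbers are assumptions, not Tez code.
final class RoundUpRangeSketch {

    /** Partitions merged per task if the range is rounded up rather than truncated. */
    static int roundedUpRange(int currentParallelism, int desiredTaskParallelism) {
        return (currentParallelism + desiredTaskParallelism - 1) / desiredTaskParallelism;
    }

    /** Estimated input per task in MB, assuming every partition has the same size. */
    static long perTaskInputMb(long totalInputMb, int currentParallelism, int range) {
        return totalInputMb * range / currentParallelism;
    }

    public static void main(String[] args) {
        int currentParallelism = 36;
        int desiredTaskParallelism = 22;
        long maxPerTaskMb = 1024;                                   // assumed 1 GB limit
        long totalInputMb = desiredTaskParallelism * maxPerTaskMb;  // assumed total input

        int range = roundedUpRange(currentParallelism, desiredTaskParallelism);
        long perTaskMb = perTaskInputMb(totalInputMb, currentParallelism, range);

        System.out.println("range=" + range
            + " perTaskMb=" + perTaskMb
            + " exceedsMax=" + (perTaskMb > maxPerTaskMb));
    }
}
```

Under these assumptions each task merges two partitions and receives more than 
the 1 GB it was sized for, which is the failure mode described above.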


> Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than 
> half.
> ---------------------------------------------------------------------------------
>
>                 Key: TEZ-3126
>                 URL: https://issues.apache.org/jira/browse/TEZ-3126
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Critical
>         Attachments: TEZ-3126.1.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
>     int desiredTaskParallelism = 
>         (int)(
>             (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>             desiredTaskInputDataSize);
>     if(desiredTaskParallelism < minTaskParallelism) {
>       desiredTaskParallelism = minTaskParallelism;
>     }
>     
>     if(desiredTaskParallelism >= currentParallelism) {
>       return true;
>     }
>     
>     // most shufflers will be assigned this range
>     basePartitionRange = currentParallelism/desiredTaskParallelism;
>     
>     if (basePartitionRange <= 1) {
>       // nothing to do if range is equal 1 partition. shuffler does it by
>       // default
>       return true;
>     }
> {code}
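
The integer division in the quoted snippet can be reproduced in isolation. 
The sketch below is a simplified stand-in, not the actual Tez method: the 
class and helper names are hypothetical, and only the two early-return checks 
from the snippet are modelled.

```java
// Hypothetical sketch of the two early-return checks in the quoted snippet.
final class BasePartitionRangeSketch {

    /** Same truncating integer division as in the quoted code. */
    static int basePartitionRange(int currentParallelism, int desiredTaskParallelism) {
        return currentParallelism / desiredTaskParallelism;
    }

    /** True when the quoted logic would return without re-configuring the vertex. */
    static boolean skipsReconfiguration(int currentParallelism, int desiredTaskParallelism) {
        if (desiredTaskParallelism >= currentParallelism) {
            return true;
        }
        return basePartitionRange(currentParallelism, desiredTaskParallelism) <= 1;
    }

    public static void main(String[] args) {
        int currentParallelism = 36;
        for (int desired : new int[] {22, 19, 18}) {
            System.out.println(currentParallelism + " -> " + desired
                + ": basePartitionRange=" + basePartitionRange(currentParallelism, desired)
                + ", skipped=" + skipsReconfiguration(currentParallelism, desired));
        }
    }
}
```

Because the division truncates, any desired parallelism greater than half of 
the current value yields a range of 1, so a reduction from 36 to 22 is skipped 
while a reduction from 36 to 18 is not.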



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
