[ 
https://issues.apache.org/jira/browse/TEZ-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716570#comment-15716570
 ] 

Zhiyuan Yang edited comment on TEZ-3552 at 12/2/16 9:30 PM:
------------------------------------------------------------

[~mingma] Are you going to work on this? If not, I'll take it over. Do you 
think it's necessary to add another knob for whether to shuffle splits? I 
prefer not to.


was (Author: aplusplus):
[~mingma] Are you going to work on this? If not, I'll take it over. Do you 
think it's necessary to add another knob for whether to shuffle splits? I 
prefer to simply randomize if sorting is turned off.

> Shuffle split array when size-based sorting is turned off
> ---------------------------------------------------------
>
>                 Key: TEZ-3552
>                 URL: https://issues.apache.org/jira/browse/TEZ-3552
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Ming Ma
>
> TEZ-3430 adds the ability to skip size-based split sorting to help with 
> job runtime. Further testing showed that, for certain inputs, the original 
> split array before sorting isn't randomly distributed in size. So when split 
> sorting is turned off, we should shuffle the splits instead of doing nothing. 
> That will make the size distribution more even.
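
For illustration only, the behavior proposed in the description could be
sketched as below. This is not the Tez implementation: the SplitOrdering
class, the Split type, the order() method, and the sortBySize flag are all
hypothetical names, and the largest-first sort direction is an assumption.
The sketch shows just the idea: when size-based sorting is off, shuffle the
splits rather than leave them in their original order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Tez.
class SplitOrdering {

    /** Hypothetical stand-in for an input split; only its length matters here. */
    static final class Split {
        final String name;
        final long length;

        Split(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    /**
     * Returns a reordered copy of the splits and leaves the input untouched.
     * With sortBySize set, splits are sorted by length, largest first
     * (the direction is an assumption). Otherwise they are shuffled, so that
     * a size-skewed original order does not carry over.
     */
    static List<Split> order(List<Split> splits, boolean sortBySize, Random rng) {
        List<Split> result = new ArrayList<>(splits);
        if (sortBySize) {
            result.sort(Comparator.comparingLong((Split s) -> s.length).reversed());
        } else {
            Collections.shuffle(result, rng);
        }
        return result;
    }
}
```

Passing the Random in explicitly keeps the shuffle reproducible in tests. With
this shape no extra configuration knob is needed: turning sorting off implies
shuffling, which matches the earlier version of the comment above.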



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
