[
https://issues.apache.org/jira/browse/TEZ-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716570#comment-15716570
]
Zhiyuan Yang edited comment on TEZ-3552 at 12/2/16 9:30 PM:
------------------------------------------------------------
[~mingma] Are you going to work on this? If not, I'll take it over. Do you
think it's necessary to add another knob for whether to shuffle splits? I
prefer not to.
was (Author: aplusplus):
[~mingma] Are you going to work on this? If not, I'll take it over. Do you
think it's necessary to add another knob for whether to shuffle splits? I
prefer simply to do randomization if sorting is turned off.
> Shuffle split array when size-based sorting is turned off
> ---------------------------------------------------------
>
> Key: TEZ-3552
> URL: https://issues.apache.org/jira/browse/TEZ-3552
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ming Ma
>
> TEZ-3430 adds the functionality to skip size-based split sorting to help with
> job runtime. During further testing, we found that the original split array
> for certain inputs isn't randomly distributed in size before sorting. So when
> split sorting is turned off, we should shuffle the splits instead of doing
> nothing. That will make the size distribution more even.
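
A minimal sketch of the behavior proposed above, with no extra knob: sort by
size when sorting is on, otherwise shuffle. This is not the Tez code.
SplitOrderingSketch and orderSplits are hypothetical names, splits are
reduced to their sizes in bytes, and largest-first sort order is an
assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Apache Tez.
class SplitOrderingSketch {

    // Returns a reordered copy of the split sizes.
    // sortBySize == true  -> size-based sort (largest first, an assumption).
    // sortBySize == false -> random shuffle instead of leaving the input order.
    static List<Long> orderSplits(List<Long> splitSizes, boolean sortBySize, Random rng) {
        List<Long> ordered = new ArrayList<>(splitSizes);
        if (sortBySize) {
            ordered.sort(Comparator.reverseOrder());
        } else {
            // Breaks up runs of similarly sized splits in the original array.
            Collections.shuffle(ordered, rng);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Input whose original order is skewed: small splits first, large last.
        List<Long> sizes = Arrays.asList(10L, 10L, 10L, 500L, 500L, 500L);
        System.out.println("sorted:   " + orderSplits(sizes, true, new Random()));
        System.out.println("shuffled: " + orderSplits(sizes, false, new Random()));
    }
}
```

Passing the Random in keeps the shuffle reproducible in tests. The input list
is copied, so the caller's array order is left untouched.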