[
https://issues.apache.org/jira/browse/TEZ-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065041#comment-17065041
]
David Mollitor commented on TEZ-4130:
-------------------------------------
I think the override needs to occur before this block of code, otherwise
{{desiredNumSplits}} may change after it has already been used in this check.
{code:java}
if (desiredNumSplits == 0 ||
originalSplits.size() == 0 ||
desiredNumSplits >= originalSplits.size()) {
...
}
{code}
Also, there should be some sanity check that {{configMaxSplits}} has a value
greater than zero.
Finally, please change the name of the configuration to be
{{tez.grouping.split-count.max}} so that it sorts alphabetically with
{{tez.grouping.split-count}}.
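To illustrate the ordering and the sanity check, here is a rough sketch, not the actual patch. The class and the helper names ({{applyMaxSplits}}, {{skipsGrouping}}) are hypothetical, and treating a non-positive {{configMaxSplits}} as "no limit" is only an assumption about how the check could behave.

```java
// Hypothetical sketch; names and the "<= 0 means no limit" rule are assumptions.
class SplitLimitSketch {
    // Proposed key name from this comment, not necessarily the final one.
    static final String MAX_SPLITS_KEY = "tez.grouping.split-count.max";

    // Sanity check plus override: ignore a non-positive limit, otherwise clamp.
    static int applyMaxSplits(int desiredNumSplits, int configMaxSplits) {
        if (configMaxSplits <= 0) {
            return desiredNumSplits; // assumed meaning: limit not configured or invalid
        }
        return Math.min(desiredNumSplits, configMaxSplits);
    }

    // Mirrors the condition quoted above that short-circuits grouping.
    static boolean skipsGrouping(int desiredNumSplits, int originalSplitCount) {
        return desiredNumSplits == 0
            || originalSplitCount == 0
            || desiredNumSplits >= originalSplitCount;
    }

    public static void main(String[] args) {
        int desired = 5000;   // illustrative values only
        int original = 4500;
        int max = 4000;

        // Check first, override later: grouping is skipped and the limit is never enforced.
        System.out.println("skip without override: " + skipsGrouping(desired, original));

        // Override first, then check: the clamped value is what the check sees.
        int effective = applyMaxSplits(desired, max);
        System.out.println("effective desired splits: " + effective);
        System.out.println("skip with override: " + skipsGrouping(effective, original));
    }
}
```

With these illustrative numbers, checking before the override would skip grouping and leave all 4500 original splits in place, while overriding first lets grouping proceed toward the 4000 limit.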
> Config for hard limiting the number of splits
> ---------------------------------------------
>
> Key: TEZ-4130
> URL: https://issues.apache.org/jira/browse/TEZ-4130
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4130.01.patch
>
>
> During the investigation of a customer issue, I found that Tez generated a
> DAG plan containing >4k tasks. It failed for Hive because of bucket number
> limitations (4k). This can be avoided with proper configuration, e.g. bigger
> splits (tez.grouping.min-size), but it might be more convenient for users to
> configure a hard limit for the number of splits.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)