[
https://issues.apache.org/jira/browse/PIG-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohini Palaniswamy updated PIG-4162:
------------------------------------
Attachment: PIG-4162-1.patch
The following changes were made:
- Always estimate intermediate reducer parallelism, even if the user has
specified PARALLEL.
- intermediate reducer parallelism = Math.min(2 * userparallelism,
Math.max(userparallelism, Math.max(estimatedparallelism,
Math.max(2999, PigReducerEstimator.MAX_REDUCER_COUNT_PARAM)))), i.e., the
estimated parallelism is limited to at most 2x userparallelism or 2999.
2999 is hardcoded for now. It differs from the final reducer max parallelism
default of 999 and applies only to intermediate reducers. Will make it
configurable later if needed.
- ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_DESIRED_TASK_INPUT_SIZE is
now set to the block size for intermediate tasks (same as mapper behaviour)
instead of InputSizeReducerEstimator.DEFAULT_BYTES_PER_REDUCER, which defaults
to 1G.
The patch also includes a few other minor, unrelated fixes.
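
As a rough illustration of the clamping described in the second item, here is
a minimal Java sketch. It follows the prose summary (the estimate is capped at
2999 and at 2x the user-specified value, and never drops below the
user-specified value) rather than the formula exactly as typed, so it is an
assumption about the intent and not the code in the patch. The class, method,
and constant names are hypothetical, and userParallelism is assumed to be
positive (i.e. PARALLEL was specified).

```java
public class IntermediateParallelismSketch {
    // Hardcoded cap for intermediate reducers mentioned in the comment
    // (hypothetical constant name, not taken from the patch).
    static final int INTERMEDIATE_MAX = 2999;

    // Assumed reading of the rule: cap the estimate at 2999, never go below
    // the user-specified PARALLEL, and never exceed twice that value.
    static int intermediateParallelism(int userParallelism, int estimatedParallelism) {
        int cappedEstimate = Math.min(estimatedParallelism, INTERMEDIATE_MAX);
        return Math.min(2 * userParallelism, Math.max(userParallelism, cappedEstimate));
    }

    public static void main(String[] args) {
        System.out.println(intermediateParallelism(100, 50));    // 100: estimate below PARALLEL
        System.out.println(intermediateParallelism(100, 150));   // 150: estimate within 2x
        System.out.println(intermediateParallelism(100, 5000));  // 200: limited to 2x PARALLEL
        System.out.println(intermediateParallelism(2000, 5000)); // 2999: limited by the cap
    }
}
```

Under this reading, the user-specified PARALLEL acts as a floor and twice that
value acts as a ceiling for the estimate.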
> Intermediate reducer parallelism in Tez should be higher
> --------------------------------------------------------
>
> Key: PIG-4162
> URL: https://issues.apache.org/jira/browse/PIG-4162
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.14.0
>
> Attachments: PIG-4162-1.patch
>
>