[
https://issues.apache.org/jira/browse/PIG-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Graham updated PIG-2652:
-----------------------------
Attachment: PIG-2652_1.patch
Here's a patch that sets the number of reducers to -1 in {{MRController}} for
sampled operations if it hasn't been set to a value greater than 1. The -1 then
triggers the reducer estimator in {{JobControlCompiler}}.
A related fix would be to skip the sampling when the number of reducers has
been explicitly set to 1.
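To illustrate the idea, here is a minimal, self-contained sketch of the behavior described above. It is not the patch itself and does not use Pig's classes: the class name, the method names, and the bytes-per-reducer and max-reducer values are all hypothetical, chosen only to show how a -1 sentinel can hand the decision over to an input-size-based estimate.

```java
// Hypothetical sketch; none of these names come from the Pig code base.
public class ReducerParallelismSketch {

    // Sentinel meaning "not set, let the estimator decide".
    static final int ESTIMATE = -1;

    // For sampled operations (skew join, order by): keep an explicit value
    // greater than 1, otherwise fall back to the sentinel. Note that an
    // explicit 1 is treated the same as "unset" here, which is why skipping
    // the sampling for an explicit 1 would be a separate fix.
    static int normalizeForSampledJob(int requestedParallelism) {
        return requestedParallelism > 1 ? requestedParallelism : ESTIMATE;
    }

    // Assumed estimate: ceil(input bytes / bytes per reducer), clamped to
    // the range [1, maxReducers].
    static int estimateReducers(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        long needed = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
        return (int) Math.max(1, Math.min(maxReducers, needed));
    }

    // Use the explicit value if one survives normalization, else estimate.
    static int resolveReducers(int requested, long totalInputBytes,
                               long bytesPerReducer, int maxReducers) {
        int normalized = normalizeForSampledJob(requested);
        return normalized == ESTIMATE
                ? estimateReducers(totalInputBytes, bytesPerReducer, maxReducers)
                : normalized;
    }

    public static void main(String[] args) {
        long perReducer = 1_000_000_000L; // illustrative value only
        int max = 999;                    // illustrative value only

        // Nothing set: estimated from 5 GB of input.
        System.out.println(resolveReducers(0, 5_000_000_000L, perReducer, max));
        // Explicit value greater than 1: kept as-is.
        System.out.println(resolveReducers(20, 5_000_000_000L, perReducer, max));
    }
}
```

With these illustrative values, an unset parallelism on 5 GB of input resolves to 5 reducers instead of 1, while an explicit 20 is left untouched.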
> Skew join and order by don't trigger reducer estimation
> -------------------------------------------------------
>
> Key: PIG-2652
> URL: https://issues.apache.org/jira/browse/PIG-2652
> Project: Pig
> Issue Type: Bug
> Reporter: Bill Graham
> Attachments: PIG-2652_1.patch
>
>
> If none of PARALLEL, default parallel, or {{mapred.reduce.tasks}} is set, the
> number of reducers is not estimated based on input size for skew joins or
> order by. Instead, these jobs get only 1 reducer.