[ https://issues.apache.org/jira/browse/PIG-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-1709.
-----------------------------
Resolution: Fixed
Release Note: In skewed join, if one large key requires more reducers than
available, we give it all the available reducers.
Hadoop Flags: [Reviewed]
test-patch:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or
modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning
messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number
of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs
warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the
total number of release audit warnings.
unit test:
all pass
end-to-end test:
all pass
Patch committed to both trunk and 0.8 branch.
> Skewed join use fewer reducer for extreme large key
> ---------------------------------------------------
>
> Key: PIG-1709
> URL: https://issues.apache.org/jira/browse/PIG-1709
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
> Attachments: PIG-1709-1.patch
>
>
> In skewed join, we use PartitionSkewedKeys to calculate the number of reducers
> needed for a single key. If the result is larger than the total number of
> reducers, it wraps around the reducer count. For example, if Pig calculates
> that we need 12 reducers to hold a key in memory and the job has 10 reducers
> in total, we allocate only 2 reducers to this key. We should use all 10
> reducers in this case.
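
The arithmetic in the description can be illustrated with a minimal sketch. This is not the code from PIG-1709-1.patch or from PartitionSkewedKeys. The class and method names are hypothetical, and reading "round it with reducer#" as a modulo operation is an assumption based on the 12-into-10-gives-2 example.

```java
// Hypothetical illustration only; not taken from the Pig source tree.
final class SkewedKeyReducers {
    private SkewedKeyReducers() {}

    // Reported behaviour (assumed to be a modulo wrap): a key that needs
    // more reducers than the job has ends up with only the remainder.
    static int wrappedReducers(int needed, int total) {
        return needed > total ? needed % total : needed;
    }

    // Behaviour described in the release note: a key that needs more
    // reducers than are available gets all available reducers.
    static int cappedReducers(int needed, int total) {
        return Math.min(needed, total);
    }

    public static void main(String[] args) {
        int needed = 12; // reducers required to hold the key in memory
        int total = 10;  // reducers configured for the job
        System.out.println("wrapped: " + wrappedReducers(needed, total));
        System.out.println("capped:  " + cappedReducers(needed, total));
    }
}
```

With the numbers from the report, the wrapped calculation yields 2 reducers and the capped one yields 10. Keys that need fewer reducers than the job has are unaffected by either version.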