[ 
https://issues.apache.org/jira/browse/PIG-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-1709.
-----------------------------

      Resolution: Fixed
    Release Note: In skewed join, if one large key requires more reducers than 
available, we give it all the available reducers.
    Hadoop Flags: [Reviewed]

test-patch:
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or 
modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning 
messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

unit test:
    all pass

end-to-end test:
    all pass

Patch committed to both trunk and 0.8 branch.

> Skewed join use fewer reducer for extreme large key
> ---------------------------------------------------
>
>                 Key: PIG-1709
>                 URL: https://issues.apache.org/jira/browse/PIG-1709
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.0
>
>         Attachments: PIG-1709-1.patch
>
>
> In skewed join, we use PartitionSkewedKeys to calculate the number of reducers 
> needed for a single key. If the result is larger than the total number of 
> reducers, it wraps around the reducer count. E.g., if Pig calculates that we 
> need 12 reducers to hold a key in memory, and the total number of reducers for 
> this job is 10, we then allocate only 2 reducers to this key. We should use 
> all 10 reducers in this case.
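
The arithmetic in the description can be sketched as follows. This is only an illustration, not the code from PIG-1709-1.patch: the class and method names are hypothetical, and the modulo in allocateBeforeFix is an assumption inferred from the 12 -> 2 example, since the report does not show the actual expression used in PartitionSkewedKeys.

```java
// Illustrative sketch only. All names here are hypothetical and are not
// taken from Pig's PartitionSkewedKeys or from the committed patch.
final class SkewedReducerAllocation {

    // Assumed pre-fix behavior: a count above the total wraps around,
    // which reproduces the "12 needed, 10 available -> 2" example.
    static int allocateBeforeFix(int reducersNeeded, int totalReducers) {
        return reducersNeeded > totalReducers
                ? reducersNeeded % totalReducers
                : reducersNeeded;
    }

    // Behavior described in the release note: a key that needs more
    // reducers than are available gets all available reducers.
    static int allocateAfterFix(int reducersNeeded, int totalReducers) {
        return Math.min(reducersNeeded, totalReducers);
    }

    public static void main(String[] args) {
        System.out.println(allocateBeforeFix(12, 10)); // prints 2
        System.out.println(allocateAfterFix(12, 10));  // prints 10
    }
}
```

Capping with Math.min leaves keys that fit within the reducer count unchanged, so only the oversized-key case behaves differently.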

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
