[ 
https://issues.apache.org/jira/browse/PIG-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1709:
----------------------------

    Attachment: PIG-1709-1.patch

The current code issues a warning when we don't have enough reducers. 
However, the warning message is easily ignored, and the skewed join 
continues with an incorrect reducer allocation. The attached patch uses all 
available reducers in this case; the warning message is still issued.

> Skewed join use fewer reducer for extreme large key
> ---------------------------------------------------
>
>                 Key: PIG-1709
>                 URL: https://issues.apache.org/jira/browse/PIG-1709
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.9.0
>
>         Attachments: PIG-1709-1.patch
>
>
> In skewed join, we use PartitionSkewedKeys to calculate the number of 
> reducers needed for a single key. If the result is larger than the total 
> number of reducers, we currently wrap it around the reducer count. E.g., if 
> Pig calculates that we need 12 reducers to hold a key in memory, and the 
> total number of reducers for this job is 10, we allocate only 2 reducers to 
> this key. We should use all 10 reducers in this case.
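
The arithmetic in the description can be sketched as follows. This is only an illustration, not the code in PartitionSkewedKeys or in the attached patch: the class and method names are hypothetical, and the wrap-around is assumed to be a modulo, inferred from the 12-of-10 example yielding 2.

```java
public class SkewedReducerAllocation {

    // Hypothetical model of the reported behaviour: when a key needs more
    // reducers than the job has, the count wraps around (assumed modulo).
    public static int wrappedAllocation(int needed, int totalReducers) {
        return needed > totalReducers ? needed % totalReducers : needed;
    }

    // Hypothetical model of the proposed behaviour: never allocate more than
    // the job has, but use every available reducer when the key needs more.
    public static int cappedAllocation(int needed, int totalReducers) {
        return Math.min(needed, totalReducers);
    }

    public static void main(String[] args) {
        int needed = 12;        // reducers required to hold the key in memory
        int totalReducers = 10; // reducers configured for the job
        System.out.println("wrapped: " + wrappedAllocation(needed, totalReducers));
        System.out.println("capped:  " + cappedAllocation(needed, totalReducers));
    }
}
```

Under these assumptions the wrapped variant gives the largest keys the fewest reducers, which is the opposite of what a skewed join needs; capping at the total keeps the allocation monotonic in the key size.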

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
