[
https://issues.apache.org/jira/browse/PIG-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928869#action_12928869
]
Daniel Dai commented on PIG-1709:
---------------------------------
As Thejas pointed out, the right approach, and the original design, is to fail
over in such a case. This is currently broken and we should fix it.
> Skewed join use fewer reducer for extreme large key
> ---------------------------------------------------
>
> Key: PIG-1709
> URL: https://issues.apache.org/jira/browse/PIG-1709
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.9.0
>
>
> In skewed join, we use PartitionSkewedKeys to calculate the number of
> reducers needed for a single key. If the result is larger than the total
> number of reducers, it wraps around the reducer count. E.g., if Pig
> calculates that we need 12 reducers to hold a key in memory, and the total
> number of reducers for this job is 10, we then allocate only 2 reducers to
> this key. We should use all 10 reducers in this case.
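
The arithmetic in the description can be illustrated with a small sketch. This is not Pig's code: the function names are hypothetical, and the modulo is only an assumption inferred from the 12 -> 2 example in the report. The capped variant shows the behaviour the description asks for.

```python
def reducers_wrapped(needed: int, total: int) -> int:
    """Reported behaviour (assumed to be a modulo): the count wraps around."""
    return needed % total if needed > total else needed


def reducers_capped(needed: int, total: int) -> int:
    """Requested behaviour: never exceed the total, but use all of it."""
    return min(needed, total)


if __name__ == "__main__":
    # Numbers taken from the issue description: 12 needed, 10 available.
    print(reducers_wrapped(12, 10))  # 2
    print(reducers_capped(12, 10))   # 10
```

With the capped version, a key that needs fewer reducers than the total is unaffected, while an oversized key is spread over every available reducer.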