[
https://issues.apache.org/jira/browse/PIG-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630727#action_12630727
]
Daniel Dai commented on PIG-364:
--------------------------------
I see the problem. The largest requestedParallelism determines the number of
reducers; I thought it was determined by the operator creating the map-reduce
boundary. So I have to do this check after a map-reduce operator has been
completely compiled: if it contains a limit and requestedParallelism > 1, then
add a single reducer after it. Thank you for pointing this out.
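
As a rough illustration of the check described above, here is a minimal
sketch in Python. It is not Pig's compiler code: the Operator and
MapReduceJob classes and the needs_single_reducer_limit function are
hypothetical names invented for this example, and treating an unset
parallelism as 1 is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    # Hypothetical stand-in for a physical operator in a reduce plan.
    name: str
    requested_parallelism: int = -1  # -1 means "not set" (assumption)


@dataclass
class MapReduceJob:
    # Hypothetical stand-in for one compiled map-reduce operator.
    reduce_plan: List[Operator] = field(default_factory=list)

    def reducer_count(self) -> int:
        # The largest requested parallelism in the plan decides the number
        # of reducers; unset values are treated as 1 here.
        return max([1] + [op.requested_parallelism for op in self.reduce_plan])


def needs_single_reducer_limit(job: MapReduceJob) -> bool:
    # Run this only after the job is completely compiled, so that every
    # operator's requested parallelism has been taken into account.
    has_limit = any(op.name == "limit" for op in job.reduce_plan)
    return has_limit and job.reducer_count() > 1


parallel_job = MapReduceJob(
    [Operator("group", 1), Operator("foreach", 4), Operator("limit")]
)
serial_job = MapReduceJob([Operator("group", 1), Operator("limit")])
```

Here parallel_job would need the extra single reducer, because the foreach
requests 4 reducers even though the operator at the map-reduce boundary
requests only 1; serial_job would not.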
> Limit return incorrect records when we use multiple reducer
> -----------------------------------------------------------
>
> Key: PIG-364
> URL: https://issues.apache.org/jira/browse/PIG-364
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: types_branch
> Reporter: Daniel Dai
> Assignee: Shravan Matthur Narayanamurthy
> Fix For: types_branch
>
> Attachments: PIG-364.patch
>
>
> Currently we put the Limit(k) operator in the reducer plan. However, in the
> case of n reducers, we will get up to n*k output records.
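
The n*k behaviour can be reproduced with a small simulation. This is only a
sketch in plain Python, not Pig code: the partitions, the value of k, and the
helper names limit and run_reducers are made up for the example.

```python
from itertools import islice


def limit(records, k):
    # Keep at most the first k records.
    return list(islice(records, k))


def run_reducers(partitions, k):
    # Each reducer applies Limit(k) to its own partition independently,
    # and the outputs of all reducers are concatenated.
    output = []
    for partition in partitions:
        output.extend(limit(partition, k))
    return output


# Three reducers (n = 3), each holding more than k records.
partitions = [[1, 2, 3, 4], [5, 6, 7], [8, 9, 10, 11, 12]]
k = 2

per_reducer_output = run_reducers(partitions, k)  # up to n*k records
final_output = limit(per_reducer_output, k)       # extra single-reducer pass
```

With these inputs the per-reducer limit yields 6 records (n*k), and only the
additional pass through a single reducer brings the result back down to k.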