[
https://issues.apache.org/jira/browse/PIG-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207014#comment-13207014
]
Doug Daniels commented on PIG-2524:
-----------------------------------
I do have an "order by" in front of the limit. Thanks Daniel—I'll close this
as a dupe of PIG-2337 and apply that patch.
> LIMIT is not effective when reducers are estimated higher than 1
> ----------------------------------------------------------------
>
> Key: PIG-2524
> URL: https://issues.apache.org/jira/browse/PIG-2524
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.9.2
> Reporter: Doug Daniels
>
> If the user does not set the number of reducers on the operation or in the
> script, Pig estimates it from the input data size. If that estimate is
> greater than 1, LIMIT operators can return more rows than the limit allows.
> This happens because the reducer estimation code
> (JobControlCompiler.estimateNumberOfReducers) runs after the LimitAdjuster
> code (MRCompiler.compile). If the estimate then comes out at more than 1
> reducer, the LimitAdjuster has already run and has not added the extra
> 1-reducer MR job needed to enforce the limit.
> This seems related to PIG-2295.
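
The effect described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Pig code: the function names, the round-robin partitioning, and the row counts are all hypothetical and stand in for whatever the real MR job does. It shows why applying a limit independently in each reducer can return up to limit * reducers rows, and how one extra single-reducer pass restores the intended count.

```python
def limit_per_reducer(rows, num_reducers, limit):
    """Apply the limit independently inside each reducer (the buggy case)."""
    # Round-robin partitioning is an assumption made for illustration only;
    # it stands in for the real partitioner.
    partitions = [rows[i::num_reducers] for i in range(num_reducers)]
    output = []
    for part in partitions:
        # Each reducer only sees its own partition, so each one emits
        # up to `limit` rows.
        output.extend(part[:limit])
    return output


def limit_with_final_pass(rows, num_reducers, limit):
    """Same as above, followed by an extra single-reducer pass."""
    # The extra pass sees all reducer outputs together and truncates once.
    return limit_per_reducer(rows, num_reducers, limit)[:limit]


rows = list(range(100))
over = limit_per_reducer(rows, num_reducers=3, limit=10)
exact = limit_with_final_pass(rows, num_reducers=3, limit=10)
print(len(over))   # 30 rows: 10 from each of the 3 reducers
print(len(exact))  # 10 rows: the limit is enforced once, globally
```

With a single reducer the two functions agree, which matches the report that the problem only shows up when the estimate is higher than 1.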