[ https://issues.apache.org/jira/browse/PIG-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835093#comment-13835093 ]
Daniel Dai commented on PIG-3571:
---------------------------------
This is the task limit. POLimit has a static variable that keeps track of how
many records have been processed. However, for every <key, value> pair, we
still need to attach it to the pipeline and pull results from the leaf. Once
the limit is hit, we keep pulling EOP from the leaf. Though we get the right
final result, it is a waste to go through the pipeline when we know the limit
has been hit. I am not sure I understand your question correctly: EOP is a
concept of the pipeline (POLimit is part of it), while MR iterates through
<key, value> pairs without knowing about EOP.
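
To illustrate the idea, here is a minimal, self-contained sketch. It is not the
code in the attached patch and does not use Pig's classes: LimitSketch,
ToyLimit, pull, limitReached and run are hypothetical names, and the counter is
an instance field here rather than the static variable POLimit uses. It only
shows how a flag check in the driver loop could skip the pipeline once the
limit has been reached, while producing the same output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LimitSketch {
    /** Toy result codes: a record was emitted, or the operator has nothing more to emit. */
    public enum Status { OK, EOP }

    /** Hypothetical stand-in for a limit operator; not Pig's POLimit. */
    public static final class ToyLimit {
        private final long limit;
        private long seen = 0;
        private int pulls = 0; // how many times the pipeline was pulled

        public ToyLimit(long limit) { this.limit = limit; }

        /** Pushes one record through the "pipeline" and pulls the result. */
        public Status pull(String record, List<String> out) {
            pulls++;
            if (seen >= limit) {
                return Status.EOP; // limit already hit: the pull was wasted work
            }
            seen++;
            out.add(record);
            return Status.OK;
        }

        /** The flag the driver can check before touching the pipeline. */
        public boolean limitReached() { return seen >= limit; }

        public int pulls() { return pulls; }
    }

    /** Driver loop standing in for the MR side, which knows nothing about EOP. */
    public static int run(List<String> input, ToyLimit op, List<String> out, boolean earlyExit) {
        for (String record : input) {
            if (earlyExit && op.limitReached()) {
                break; // skip the pipeline for all remaining records
            }
            op.pull(record, out);
        }
        return op.pulls();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d", "e");

        List<String> out1 = new ArrayList<>();
        int p1 = run(input, new ToyLimit(2), out1, false);
        System.out.println("without flag: " + p1 + " pulls, output " + out1);

        List<String> out2 = new ArrayList<>();
        int p2 = run(input, new ToyLimit(2), out2, true);
        System.out.println("with flag: " + p2 + " pulls, output " + out2);
    }
}
```

With a limit of 2 and five input records, both runs emit [a, b]; the run
without the flag pulls the pipeline 5 times, the run with the flag pulls it
2 times.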
> Early termination when processing limit
> ---------------------------------------
>
> Key: PIG-3571
> URL: https://issues.apache.org/jira/browse/PIG-3571
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
> Attachments: PIG-3571-0.patch
>
>
> When we pull enough records for POLimit, currently we still pull the whole
> pipeline for every key. We could use a flag to avoid that.