Mohit Sabharwal created PIG-4542:
------------------------------------
Summary: OutputConsumerIterator should flush buffered records
Key: PIG-4542
URL: https://issues.apache.org/jira/browse/PIG-4542
Project: Pig
Issue Type: Sub-task
Components: spark
Affects Versions: spark-branch
Reporter: Mohit Sabharwal
Assignee: Mohit Sabharwal
Fix For: spark-branch
Certain operators may buffer their output. We need to flush the last set of
records from such operators when we encounter the last input record, before
calling getNextTuple() for the last time.
Currently, to flush the last set of records, we compute RDD.count() and compare
the count with a running counter to determine if we have reached the last
record. This is unnecessary and inefficient.
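One possible way to detect the last record without RDD.count() is to rely on
the input iterator's own hasNext(). The sketch below is illustrative only and
is not the actual Pig code: BufferingOperator, FlushingIterator and PairBatcher
are hypothetical names, and flush() stands in for whatever mechanism the real
operator uses to emit its remaining buffered records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an operator that may buffer its output.
interface BufferingOperator<I, O> {
    // Consume one input record; may return zero or more output records.
    List<O> process(I input);

    // Called once after the last input record; returns whatever is still buffered.
    List<O> flush();
}

// Wraps the input iterator and flushes the operator exactly once, as soon as
// input.hasNext() reports that the input is exhausted. No record count is needed.
class FlushingIterator<I, O> implements Iterator<O> {
    private final Iterator<I> input;
    private final BufferingOperator<I, O> op;
    private Iterator<O> pending = Collections.emptyIterator();
    private boolean flushed = false;

    FlushingIterator(Iterator<I> input, BufferingOperator<I, O> op) {
        this.input = input;
        this.op = op;
    }

    @Override
    public boolean hasNext() {
        // Keep pulling until some output is available or everything is drained.
        while (!pending.hasNext()) {
            if (input.hasNext()) {
                pending = op.process(input.next()).iterator();
            } else if (!flushed) {
                flushed = true;
                pending = op.flush().iterator();
            } else {
                return false;
            }
        }
        return true;
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return pending.next();
    }
}

// Example operator (hypothetical): emits one output per two inputs,
// so an odd trailing record stays buffered until flush() is called.
class PairBatcher implements BufferingOperator<Integer, String> {
    private final List<Integer> buffer = new ArrayList<>();

    @Override
    public List<String> process(Integer input) {
        buffer.add(input);
        return buffer.size() < 2 ? new ArrayList<String>() : drain();
    }

    @Override
    public List<String> flush() {
        return buffer.isEmpty() ? new ArrayList<String>() : drain();
    }

    private List<String> drain() {
        String out = buffer.toString();
        buffer.clear();
        return Arrays.asList(out);
    }
}
```

With inputs 1, 2, 3, the iterator yields the full pair first and then the
flushed remainder, and the input is traversed only once.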
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)