Mohit Sabharwal created PIG-4542:
------------------------------------

             Summary: OutputConsumerIterator should flush buffered records
                 Key: PIG-4542
                 URL: https://issues.apache.org/jira/browse/PIG-4542
             Project: Pig
          Issue Type: Sub-task
          Components: spark
    Affects Versions: spark-branch
            Reporter: Mohit Sabharwal
            Assignee: Mohit Sabharwal
             Fix For: spark-branch


Certain operators may buffer their output. When we encounter the last input 
record, we need to flush the remaining records from such operators before 
calling getNextTuple() for the last time.

Currently, to flush the last set of records, we compute RDD.count() and compare 
the count with a running counter to determine if we have reached the last 
record. This is unnecessary and inefficient, since RDD.count() is a Spark 
action that runs an extra job just to learn the number of records.
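
One possible way to detect the last record without counting is to ask the
input iterator itself: a record is the last one exactly when hasNext() returns
false after it has been read. The sketch below illustrates that idea only. It
is not the actual OutputConsumerIterator code, and every name in it
(BufferingOperator, PairingOperator, FlushingIterator, drain) is hypothetical
and stands in for Pig's real operator classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

class FlushOnLastRecordSketch {

    /** Hypothetical stand-in for an operator that may buffer its output. */
    interface BufferingOperator<I, O> {
        // isLast tells the operator to flush anything it still holds.
        List<O> process(I input, boolean isLast);
    }

    /** Toy operator: emits records in pairs and holds back an unpaired one. */
    static class PairingOperator implements BufferingOperator<Integer, String> {
        private Integer pending;

        @Override
        public List<String> process(Integer input, boolean isLast) {
            List<String> out = new ArrayList<>();
            if (pending != null) {
                out.add("(" + pending + "," + input + ")");
                pending = null;
            } else if (isLast) {
                out.add("(" + input + ")");   // flush the unpaired record
            } else {
                pending = input;              // buffer until the next record
            }
            return out;
        }
    }

    /** Wraps the input iterator; no record count is needed up front. */
    static class FlushingIterator<I, O> implements Iterator<O> {
        private final Iterator<I> input;
        private final BufferingOperator<I, O> op;
        private final Queue<O> ready = new ArrayDeque<>();

        FlushingIterator(Iterator<I> input, BufferingOperator<I, O> op) {
            this.input = input;
            this.op = op;
        }

        @Override
        public boolean hasNext() {
            // Pull input until the operator emits something or input ends.
            while (ready.isEmpty() && input.hasNext()) {
                I record = input.next();
                boolean isLast = !input.hasNext();  // last-record check
                ready.addAll(op.process(record, isLast));
            }
            return !ready.isEmpty();
        }

        @Override
        public O next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return ready.remove();
        }
    }

    /** Runs the toy operator over the given records and collects the output. */
    static List<String> drain(List<Integer> records) {
        Iterator<String> it =
            new FlushingIterator<>(records.iterator(), new PairingOperator());
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(Arrays.asList(1, 2, 3)));  // [(1,2), (3)]
    }
}
```

With three input records, the unpaired third record is emitted only because
the wrapper sees that the input is exhausted and passes isLast = true. The
input is traversed once, and no separate count is computed.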



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
