[ 
https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571
 ] 

Saikat edited comment on TEZ-2643 at 9/8/15 8:42 PM:
-----------------------------------------------------

Thanks [~rajesh.balamohan] for the review comments.
I made the following changes in patchset 2643.1.

Comment 2: Moved the spill records initialization to after the check for 
ignoreSpillIfNeeded.

Comment 1: 
I did not want to call sendPipelinedShuffleEvents() inside spill() for the 
following reasons:
a. In flush(), if sendPipelinedShuffleEvents() is called from spill(), we 
would need to change the if (!isFinalMergeEnabled) {} block, where only one 
event is sent out for the last spill.
b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we 
would need to pass a lastEvent flag to sendPipelinedShuffleEvents().
That would be too many changes, so instead spill() returns a boolean to let 
the caller know whether a spill actually happened. The caller can then decide 
whether to send events and whether the event is the last one.
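
To illustrate the approach described above, here is a minimal sketch. It is not the code from the patch: the class SpillSketch, the methods write() and spillAndNotify(), and the sentEvents list are hypothetical names made up for illustration, and sendPipelinedShuffleEvents() is shown with an isLastEvent parameter only to keep the example short. What should happen when the final spill in flush() turns out to be empty is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the sorter; NOT the actual Tez PipelinedSorter.
class SpillSketch {
    private final boolean isFinalMergeEnabled;
    private int bufferedRecords = 0;

    // Records the isLastEvent flag of every event that was "sent" (illustration only).
    final List<Boolean> sentEvents = new ArrayList<>();

    SpillSketch(boolean isFinalMergeEnabled) {
        this.isFinalMergeEnabled = isFinalMergeEnabled;
    }

    // Hypothetical helper: pretend some records were written into the buffer.
    void write(int records) {
        bufferedRecords += records;
    }

    // Returns true only if data was actually spilled, so empty spills are skipped
    // and the caller can tell the difference.
    boolean spill() {
        if (bufferedRecords == 0) {
            return false; // nothing buffered: no spill
        }
        bufferedRecords = 0; // stand-in for writing the spill to disk
        return true;
    }

    // Assumption: the real method takes no flag; the flag is added here for brevity.
    void sendPipelinedShuffleEvents(boolean isLastEvent) {
        sentEvents.add(isLastEvent);
    }

    // Caller for an intermediate spill: sends an event only if a spill happened.
    void spillAndNotify() {
        if (spill() && !isFinalMergeEnabled) {
            sendPipelinedShuffleEvents(false);
        }
    }

    // Caller for the last spill: decides here that this event is the last one.
    void flush() {
        boolean spilled = spill();
        if (!isFinalMergeEnabled && spilled) {
            sendPipelinedShuffleEvents(true);
        }
    }
}
```

With this shape, spill() stays unaware of events, and each caller decides on its own whether to send an event and whether to mark it as the last one.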


> Minimize number of empty spills in Pipelined Sorter
> ---------------------------------------------------
>
>                 Key: TEZ-2643
>                 URL: https://issues.apache.org/jira/browse/TEZ-2643
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Saikat
>            Assignee: Saikat
>         Attachments: TEZ-2643.1.patch, TEZ-2643.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
