[
https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571
]
Saikat commented on TEZ-2643:
-----------------------------
Thanks [~rajesh.balamohan] for the review comments.
Made the following changes in patchset 2643.1.
Comment 2: moved the spill records init to after the check for
ignoreSpillIfNeeded.
Comment 1:
I didn't want to put sendPipelinedShuffleEvents() inside spill() because of
the following scenario:
a. In flush(), if sendPipelinedShuffleEvents() is called from spill(), we
would need to change the if (!isFinalMergeEnabled) {} block, where an event is
sent out only for the last spill.
b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we
would need to pass a lastEvent flag to sendPipelinedShuffleEvents().
That would be too many changes, so I return a boolean value from spill() to
let the caller know whether there was actually a spill. The caller can then
decide whether to send events and whether it is the last event.
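
To illustrate the shape of that approach, here is a minimal, self-contained
sketch. It is not the code from the patch: the class SpillSketch, the
sendEvent() helper, the threshold, and the event strings are all hypothetical
stand-ins, and the handling of an empty final spill is an assumption made only
for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the sorter; NOT the actual PipelinedSorter code. */
class SpillSketch {
    private static final int SPILL_THRESHOLD = 2; // assumed value, illustration only

    private final boolean isFinalMergeEnabled;
    private int bufferedRecords = 0;
    private int numSpills = 0;

    /** Records every event "sent", so the behaviour can be inspected. */
    final List<String> sentEvents = new ArrayList<>();

    SpillSketch(boolean isFinalMergeEnabled) {
        this.isFinalMergeEnabled = isFinalMergeEnabled;
    }

    /** Returns true only if something was actually spilled. */
    boolean spill() {
        if (bufferedRecords == 0) {
            return false; // empty spill skipped, nothing initialised or written
        }
        // Per-spill state would be initialised here, after the emptiness check.
        bufferedRecords = 0;
        numSpills++;
        return true;
    }

    /** Hypothetical helper: the caller, not spill(), chooses isLastEvent. */
    private void sendEvent(boolean isLastEvent) {
        sentEvents.add("spill-" + (numSpills - 1) + (isLastEvent ? ":last" : ""));
    }

    /** Buffers records and spills once the (assumed) threshold is reached. */
    void collect(int records) {
        bufferedRecords += records;
        if (bufferedRecords >= SPILL_THRESHOLD && spill()) {
            sendEvent(false); // intermediate spill
        }
    }

    /** Final spill; in this sketch an event goes out only if data was spilled. */
    void flush() {
        boolean spilled = spill();
        if (!isFinalMergeEnabled && spilled) {
            sendEvent(true); // last spill
        }
    }
}
```

With this structure spill() stays unaware of events, and the existing
if (!isFinalMergeEnabled) {} check in flush() only needs to consult the
returned boolean.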
> Minimize number of empty spills in Pipelined Sorter
> ---------------------------------------------------
>
> Key: TEZ-2643
> URL: https://issues.apache.org/jira/browse/TEZ-2643
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Saikat
> Assignee: Saikat
> Attachments: TEZ-2643.1.patch, TEZ-2643.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)