[
https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571
]
Saikat edited comment on TEZ-2643 at 9/8/15 9:18 PM:
-----------------------------------------------------
Thanks [~rajesh.balamohan] for the review comments.
Made the following changes in patchset 2643.2:
Comment 2:
a. Moved the spillrecords init to after the check for ignoreSpillIfNeeded.
b. Renamed it to ignoreEmptySpills.
Comment 1:
I didn't want to call sendPipelinedShuffleEvents() inside spill() because of
the following:
a. In flush(), if sendPipelinedShuffleEvents() is called from spill(), then we
would need to change the if (!isFinalMergeEnabled) {} block, where only one
event is sent out for the last spill.
b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we
would need to pass a lastEvent flag to sendPipelinedShuffleEvents().
That would be too many changes, so spill() now returns a boolean to let the
caller know whether a spill actually happened. The caller can then decide
whether to send events and whether it is the last event.
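
A rough sketch of the approach described above, to show the control flow only.
This is not the code in the patch: the class SpillSketch, the collect() and
onBufferFull() helpers, the event strings, and treating ignoreEmptySpills as a
field are all assumptions made for illustration. Only the names spill(),
flush(), sendPipelinedShuffleEvents(), isFinalMergeEnabled and
ignoreEmptySpills come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the sorter; NOT the actual Tez PipelinedSorter. */
class SpillSketch {
    private final boolean isFinalMergeEnabled;
    private final boolean ignoreEmptySpills = true; // assumed to be a field
    private int bufferedRecords = 0;                 // hypothetical buffer state
    private int numSpills = 0;
    private final List<String> sentEvents = new ArrayList<>();

    SpillSketch(boolean isFinalMergeEnabled) {
        this.isFinalMergeEnabled = isFinalMergeEnabled;
    }

    /** Hypothetical helper: pretend some records were buffered. */
    void collect(int records) {
        bufferedRecords += records;
    }

    /** Returns true only if something was actually spilled. */
    boolean spill() {
        if (ignoreEmptySpills && bufferedRecords == 0) {
            return false; // empty spill skipped before any spill record setup
        }
        // Spill record initialization would happen here, after the check.
        bufferedRecords = 0;
        numSpills++;
        return true;
    }

    /** Unchanged contract: never marks the event as the last one. */
    void sendPipelinedShuffleEvents() {
        sentEvents.add("spill-" + (numSpills - 1));
    }

    /** Hypothetical caller for an intermediate spill. */
    void onBufferFull() {
        if (spill() && !isFinalMergeEnabled) {
            sendPipelinedShuffleEvents();
        }
    }

    /** The caller, not spill(), decides how the final event is sent. */
    void flush() {
        boolean spilled = spill();
        if (!isFinalMergeEnabled && spilled) {
            // Simplification: a single event for the last spill, marked last.
            sentEvents.add("spill-" + (numSpills - 1) + ":last");
        }
    }

    List<String> events() {
        return sentEvents;
    }
}
```

With this shape, spill() stays free of event logic and the existing
if (!isFinalMergeEnabled) {} handling in flush() does not need to move.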
was (Author: saikatr):
Thanks [~rajesh.balamohan] for the review comments.
Made the following changes in patchset 2643.1
comment 2: move the spillrecords init to after the check for
ignoreSpillIfNeeded.
Comment 1:
I didn't want to put the sendPipelinedShuffleEvents() inside spill because of
the following scenario:
a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we
need to change the if (!isFinalMergeEnabled) {} where only one event is sent
out for last spill.
b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we
would need to pass lastEvent flag to sendPipelinedShuffleEvents().
There would be too many changes, so I return a boolean value from spill, to let
the caller know if there was actually a spill, and then the caller can take a
decision to send events and if it's a last event etc.
> Minimize number of empty spills in Pipelined Sorter
> ---------------------------------------------------
>
> Key: TEZ-2643
> URL: https://issues.apache.org/jira/browse/TEZ-2643
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Saikat
> Assignee: Saikat
> Attachments: TEZ-2643.1.patch, TEZ-2643.2.patch, TEZ-2643.patch
>
>