[
https://issues.apache.org/jira/browse/TEZ-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325423#comment-14325423
]
Rajesh Balamohan commented on TEZ-2001:
---------------------------------------
>>
we can avoid final merge even without pipelining + some additional changes to
fetcher side.
>>
Thinking through "Case 2" listed above: we currently send the empty partition
details, host URL, etc. in the DM event. There is no direct way to embed the
empty partition bitsets for multiple spills (e.g. 5 spills in a task) in the same
final DM event. If we skip sending the empty partition details, the number of HTTP
connections would increase significantly in certain scenarios.
With pipelining this was a lot easier and more natural, as we would be able to
send an event for every spill in PipelinedSorter. And if a spill had some
empty partitions, those details would be embedded in the event itself.
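To make the trade-off concrete, here is a minimal, self-contained sketch of the per-spill idea. It does not use the Tez API: SpillEvent, eventsForSpills and fetchesNeeded are hypothetical names, the partition sizes are made-up sample data, and it assumes, for illustration only, that each fetch of a (spill, partition) pair costs one HTTP connection.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class SpillEventSketch {

    /** Hypothetical stand-in for a per-spill event; NOT the Tez DataMovementEvent class. */
    static final class SpillEvent {
        final int spillId;
        final boolean lastSpill;
        final BitSet emptyPartitions; // bit p set => partition p has no data in this spill

        SpillEvent(int spillId, boolean lastSpill, BitSet emptyPartitions) {
            this.spillId = spillId;
            this.lastSpill = lastSpill;
            this.emptyPartitions = emptyPartitions;
        }
    }

    /** Builds one event per spill; sizes[s][p] is the byte count of partition p in spill s. */
    static List<SpillEvent> eventsForSpills(long[][] sizes) {
        List<SpillEvent> events = new ArrayList<>();
        for (int s = 0; s < sizes.length; s++) {
            BitSet empty = new BitSet(sizes[s].length);
            for (int p = 0; p < sizes[s].length; p++) {
                if (sizes[s][p] == 0) {
                    empty.set(p);
                }
            }
            events.add(new SpillEvent(s, s == sizes.length - 1, empty));
        }
        return events;
    }

    /**
     * Counts the fetches a consumer of the given partition would issue.
     * With useEmptyInfo == false the consumer cannot tell which spills are
     * empty for its partition, so it has to contact the host for every spill.
     */
    static int fetchesNeeded(List<SpillEvent> events, int partition, boolean useEmptyInfo) {
        int fetches = 0;
        for (SpillEvent e : events) {
            if (useEmptyInfo && e.emptyPartitions.get(partition)) {
                continue; // nothing to fetch for this spill
            }
            fetches++;
        }
        return fetches;
    }

    public static void main(String[] args) {
        // Made-up sample: 5 spills, 3 partitions.
        long[][] sizes = {
            {10, 0, 5},
            {0, 0, 7},
            {3, 0, 0},
            {0, 0, 2},
            {8, 0, 0}
        };
        List<SpillEvent> events = eventsForSpills(sizes);
        for (int p = 0; p < 3; p++) {
            System.out.println("partition " + p
                + ": with bitset=" + fetchesNeeded(events, p, true)
                + ", without=" + fetchesNeeded(events, p, false));
        }
    }
}
```

In this sample, partition 1 is empty in every spill, so a consumer that sees the per-spill bitsets issues no fetches for it, while a consumer without that information would issue five.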
> Support pipelined data transfer for ordered output
> --------------------------------------------------
>
> Key: TEZ-2001
> URL: https://issues.apache.org/jira/browse/TEZ-2001
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2001.1.patch, TEZ-2001.2.patch
>
>