[ 
https://issues.apache.org/jira/browse/TEZ-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325423#comment-14325423
 ] 

Rajesh Balamohan commented on TEZ-2001:
---------------------------------------

>>
we can avoid final merge even without pipelining + some additional changes to 
fetcher side.
>>

Thinking through "Case 2" listed above: we currently send the empty-partition 
details, host URL, etc. in the DM event.  There is no direct way to embed the 
empty-partition bitsets for multiple spills (e.g., 5 spills in a task) in the 
same final DM event.  If we skip sending the empty-partition details, the 
number of HTTP connections would grow significantly in certain scenarios, 
because fetchers could no longer skip partitions that hold no data.
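
To make the connection-count concern concrete, here is a minimal sketch that 
counts how many fetch requests one consumer partition would issue with and 
without per-spill empty-partition information.  SpillFetchCount and its 
methods are hypothetical names used only for illustration; they are not part 
of Tez, and the 5-spill layout is an assumed example.

```java
import java.util.BitSet;

// Hypothetical helper, not a Tez class: counts fetch requests per consumer partition.
final class SpillFetchCount {

    // emptyPartitions[s] has bit p set when spill s wrote no data for partition p.
    static int fetchesWithEmptyInfo(BitSet[] emptyPartitions, int partition) {
        int fetches = 0;
        for (BitSet empty : emptyPartitions) {
            if (!empty.get(partition)) {
                fetches++; // only spills that hold data for this partition are requested
            }
        }
        return fetches;
    }

    // Without the bitsets, the consumer has to ask for every spill.
    static int fetchesWithoutEmptyInfo(BitSet[] emptyPartitions) {
        return emptyPartitions.length;
    }

    // Assumed example: 5 spills, and partition 3 has data only in spill 2.
    static BitSet[] exampleSpills() {
        BitSet[] spills = new BitSet[5];
        for (int s = 0; s < spills.length; s++) {
            spills[s] = new BitSet(4);
            if (s != 2) {
                spills[s].set(3);
            }
        }
        return spills;
    }

    public static void main(String[] args) {
        BitSet[] spills = exampleSpills();
        System.out.println("with empty-partition info: "
                + fetchesWithEmptyInfo(spills, 3) + " request(s)");
        System.out.println("without empty-partition info: "
                + fetchesWithoutEmptyInfo(spills) + " request(s)");
    }
}
```

The gap between the two counts is multiplied by the number of consumer tasks 
and producer tasks, which is where the extra connections would come from.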

With pipelining this was much easier and more natural, as we would be able to 
send an event for every spill in PipelinedSorter.  If a spill had empty 
partitions, those details would be embedded in that spill's own event.  
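
The per-spill event idea can be sketched roughly as follows.  This is only an 
illustration under stated assumptions: SpillEvent and PipelinedSpillSketch are 
hypothetical classes, the lastSpill flag is an assumed way to mark the final 
spill, and the actual Tez DM event payload is not modeled here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical per-spill event; it only mirrors the idea of one event per spill.
final class SpillEvent {
    final int spillId;
    final BitSet emptyPartitions; // bit p set => this spill has no data for partition p
    final boolean lastSpill;      // assumed marker for the final spill of the task

    SpillEvent(int spillId, BitSet emptyPartitions, boolean lastSpill) {
        this.spillId = spillId;
        this.emptyPartitions = emptyPartitions;
        this.lastSpill = lastSpill;
    }
}

final class PipelinedSpillSketch {

    // bytesPerSpill[s][p] is the number of bytes spill s wrote for partition p.
    static List<SpillEvent> eventsFor(long[][] bytesPerSpill) {
        List<SpillEvent> events = new ArrayList<>();
        for (int s = 0; s < bytesPerSpill.length; s++) {
            BitSet empty = new BitSet(bytesPerSpill[s].length);
            for (int p = 0; p < bytesPerSpill[s].length; p++) {
                if (bytesPerSpill[s][p] == 0) {
                    empty.set(p);
                }
            }
            events.add(new SpillEvent(s, empty, s == bytesPerSpill.length - 1));
        }
        return events;
    }

    // Consumer side: skip every spill whose event marks the partition as empty.
    static List<Integer> spillsToFetch(List<SpillEvent> events, int partition) {
        List<Integer> ids = new ArrayList<>();
        for (SpillEvent e : events) {
            if (!e.emptyPartitions.get(partition)) {
                ids.add(e.spillId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        long[][] bytes = {{10, 0}, {0, 0}, {5, 7}};
        List<SpillEvent> events = eventsFor(bytes);
        System.out.println("partition 1 fetches spills " + spillsToFetch(events, 1));
    }
}
```

Because each event carries its own bitset, nothing has to be merged into a 
single final event, which is the difficulty described for the non-pipelined 
case.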

> Support pipelined data transfer for ordered output
> --------------------------------------------------
>
>                 Key: TEZ-2001
>                 URL: https://issues.apache.org/jira/browse/TEZ-2001
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2001.1.patch, TEZ-2001.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
