[ 
https://issues.apache.org/jira/browse/TEZ-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328628#comment-14328628
 ] 

Gopal V edited comment on TEZ-2001 at 2/20/15 6:56 AM:
-------------------------------------------------------

bq. Isn't straggler mitigation one of the primary motivations for this?

No, we aren't aiming this at stragglers - the issue is skewed data itself.

Running a skewed task again on a different node is probably not going to make 
it any faster to complete.

Speculation is unlikely to be successful in scenarios with significant skew. 

On a reliable mid-size cluster, turning speculation off might be a better win
for throughput, particularly when dealing with the middle reducer of an MRR
DAG (i.e. two reducers pulling data off the same shuffle handlers and eating
up bandwidth).
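
To make the straggler vs. skew distinction concrete, here is a rough sketch. It is a toy model and not Tez code: the function names, the node speeds and the record counts are all made-up assumptions, and it only assumes that an attempt's runtime is its input size divided by the node's throughput.

```python
# Toy model (not Tez code): all names and numbers here are illustrative
# assumptions. An attempt's runtime is input size / node throughput.

def attempt_runtime(records, node_speed):
    """Seconds for one attempt, given records and records-per-second."""
    return records / node_speed


def finish_time(records, original_speed, backup_speed, backup_start):
    """The task finishes when the first of the two attempts completes.

    The speculative (backup) attempt starts backup_start seconds late and
    has to process the same input as the original attempt.
    """
    original = attempt_runtime(records, original_speed)
    backup = backup_start + attempt_runtime(records, backup_speed)
    return min(original, backup)


# Straggler: normal input (1000 records), but the original node is slow.
straggler_plain = attempt_runtime(1000, 25)
straggler_spec = finish_time(1000, 25, 100, backup_start=10)

# Skew: healthy node, but this task received 10x the input.
skew_plain = attempt_runtime(10000, 100)
skew_spec = finish_time(10000, 100, 100, backup_start=10)

print("straggler:", straggler_plain, "->", straggler_spec)  # 40.0 -> 20.0
print("skew:     ", skew_plain, "->", skew_spec)            # 100.0 -> 100.0
```

In the straggler case the backup attempt wins because the slow node was the problem. In the skew case the backup attempt has to chew through the same oversized input, so it cannot finish earlier, and in the meantime it fetches the same shuffle data a second time.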


> Support pipelined data transfer for ordered output
> --------------------------------------------------
>
>                 Key: TEZ-2001
>                 URL: https://issues.apache.org/jira/browse/TEZ-2001
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2001.1.patch, TEZ-2001.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
