[ 
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated TEZ-2972:
----------------------------
    Attachment: TEZ-2972.003.addendum.patch
                TEZ-2972-branch-0.7.001.patch

Attached is a patch for branch-0.7.  Reviews welcome.

I also noticed that the patch for master introduced a findbugs warning.  
I'm attaching an addendum patch to fix that as well, or we can revert and 
recommit if that's preferred.

> Avoid task rescheduling when a node turns unhealthy
> ---------------------------------------------------
>
>                 Key: TEZ-2972
>                 URL: https://issues.apache.org/jira/browse/TEZ-2972
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>             Fix For: 0.8.2
>
>         Attachments: TEZ-2972-branch-0.7.001.patch, TEZ-2972.001.patch, 
> TEZ-2972.002.patch, TEZ-2972.003.addendum.patch, TEZ-2972.003.patch
>
>
> This is similar to MAPREDUCE-6119.  Sometimes reacting to a node update event 
> can cause more harm than good.  For example, an UNHEALTHY node may be able to 
> shuffle just fine.  Therefore, obsoleting the output of tasks that ran on that 
> node and re-running them simply adds more overhead to the job with no 
> benefit.  It would be nice to be able to configure Tez to ignore node update 
> events if desired.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
