[ https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated TEZ-808:
---------------------------
    Attachment: TEZ-808.branch-0.7.patch

Would it be possible to backport this to branch-0.7?  We're going to be on 0.7 
for a while, and we'd like this fix (along with TEZ-2918) so that we can catch 
hung tasks in production and recover automatically. 

Attaching a version of the patch for branch-0.7.  It came over fairly cleanly.

> Handle task attempts that are not making progress
> -------------------------------------------------
>
>                 Key: TEZ-808
>                 URL: https://issues.apache.org/jira/browse/TEZ-808
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>             Fix For: 0.8.2
>
>         Attachments: TEZ-808.1.patch, TEZ-808.2.patch, TEZ-808.3.patch, 
> TEZ-808.branch-0.7.patch
>
>
> If a task attempt is not making progress then it may cause the job to hang. 
> We may want to kill and restart the attempt. With speculation support and 
> free resources we may want to run another version in parallel.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
