[ https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955591#comment-14955591 ]
Bikas Saha commented on TEZ-808:
--------------------------------
Right. I just wanted to make sure the situation is a real no-progress hang, and
thus relevant to this jira rather than to others such as speculation.
Are you suggesting that we set a timeout on the progress field sent in the
heartbeat, and kill the task attempt if progress does not increase within a
given time interval?
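
To make the question concrete, here is a rough sketch of what such a check
could look like. It is only an illustration under my own assumptions: the
class ProgressWatchdog, its method names, and the attempt id strings are all
hypothetical and are not existing Tez APIs. The caller is assumed to pass in
the progress value from each heartbeat along with a timestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a no-progress check; not part of the Tez API. */
final class ProgressWatchdog {

    /** Highest progress seen for an attempt and when it last increased. */
    private static final class Entry {
        float progress;
        long lastIncreaseMillis;

        Entry(float progress, long nowMillis) {
            this.progress = progress;
            this.lastIncreaseMillis = nowMillis;
        }
    }

    private final long timeoutMillis;
    private final Map<String, Entry> attempts = new HashMap<>();

    ProgressWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Records one heartbeat for the given attempt.
     *
     * @return true if progress has not increased for at least the timeout,
     *         i.e. the attempt is a candidate for being killed
     */
    boolean onHeartbeat(String attemptId, float progress, long nowMillis) {
        Entry entry = attempts.get(attemptId);
        if (entry == null) {
            // First heartbeat: start the clock for this attempt.
            attempts.put(attemptId, new Entry(progress, nowMillis));
            return false;
        }
        if (progress > entry.progress) {
            // Progress increased: remember it and reset the clock.
            entry.progress = progress;
            entry.lastIncreaseMillis = nowMillis;
            return false;
        }
        // No increase: stuck only once the timeout has fully elapsed.
        return nowMillis - entry.lastIncreaseMillis >= timeoutMillis;
    }

    /** Forgets an attempt once it has finished or been killed. */
    void remove(String attemptId) {
        attempts.remove(attemptId);
    }
}
```

Passing the timestamp in explicitly keeps the sketch easy to test. Whether a
single timeout fits tasks that legitimately report no progress for long
stretches is a separate question, so the interval would probably need to be
configurable.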
> Handle task attempts that are not making progress
> -------------------------------------------------
>
> Key: TEZ-808
> URL: https://issues.apache.org/jira/browse/TEZ-808
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
>
> If a task attempt is not making progress then it may cause the job to hang.
> We may want to kill and restart the attempt. With speculation support and
> free resources we may want to run another version in parallel.