[ 
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955678#comment-14955678
 ] 

Jason Lowe commented on TEZ-808:
--------------------------------

No, we can't key off the progress field.  In practice progress can go forwards, 
backwards, and every which way.  It's too risky to depend upon progress always 
monotonically increasing and always being between 0.0 and 1.0 -- some input 
formats don't get this right.  Also a task consuming a really large input might 
be making progress but it's too miniscule to be measured directly in the 
progress field between two k,v pairs.  We should not kill tasks that are 
actually making progress of some sort, even if the reported progress doesn't 
indicate it.

MapReduce solves this by separating the concepts of reporting progress and a 
progress percentage.  The Reporter object provides the ability to report that 
progress is occurring but not by how much, while the input format is in charge 
of stating how much progress has been made overall.  Currently Tez is missing 
the concept of reporting some progress is being made, hence why I suggested 
above that we should consider adding a similar thing tied to input consumption 
or output generation by the task.  For example, the MapReduce framework 
automatically reports that progress is being made in these situations (not an 
exhaustive list):
- the task is asking for the next k,v pair
- a k,v pair is emitted by the map function
- k,v pairs are being spilled to disk
- the map function completes and cleanup has started
- a task is committing or aborting files
- every 10,000 compares during a merge

And in addition user code can explicitly call progress() on the Reporter to 
indicate progress is being made if they are doing a particularly expensive 
operation to process a single k,v pair.

The basic concept is like a watchdog timer.  progress() semantically resets the 
timer and keeps the AM from killing the task.  If nobody calls progress() then 
the task will stop reporting counters and task status, but it will still ping 
the AM to let it know the task is still technically alive.  If just pings and 
no real task status is reported to the AM within a 10 minute window (by 
default) then the AM will kill the task due to lack of progress.

So I was thinking that we would add a new boolean flag to TaskStatusUpdateEvent 
to indicate whether progress has been indicated by the task since the last time 
TaskStatusUpdateEvent was sent.  Then the AM could key off of that flag to kill 
the task if progress isn't made within a configurable timeout.  We would also 
sprinkle progress indications in appropriate places in the Tez framework, like 
shuffle code, merge code, input processing, etc. where we can tell at the 
framework level that the task is moving forward.


> Handle task attempts that are not making progress
> -------------------------------------------------
>
>                 Key: TEZ-808
>                 URL: https://issues.apache.org/jira/browse/TEZ-808
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>
> If a task attempt is not making progress then it may cause the job to hang. 
> We may want to kill and restart the attempt. With speculation support and 
> free resources we may want to run another version in parallel.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to