[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955678#comment-14955678
]
Jason Lowe commented on TEZ-808:
--------------------------------
No, we can't key off the progress field. In practice progress can go forwards,
backwards, and every which way. It's too risky to depend upon progress always
monotonically increasing and always being between 0.0 and 1.0 -- some input
formats don't get this right. Also a task consuming a really large input might
be making progress but it's too miniscule to be measured directly in the
progress field between two k,v pairs. We should not kill tasks that are
actually making progress of some sort, even if the reported progress doesn't
indicate it.
MapReduce solves this by separating the concepts of reporting progress and a
progress percentage. The Reporter object provides the ability to report that
progress is occurring but not by how much, while the input format is in charge
of stating how much progress has been made overall. Currently Tez is missing
the concept of reporting some progress is being made, hence why I suggested
above that we should consider adding a similar thing tied to input consumption
or output generation by the task. For example, the MapReduce framework
automatically reports that progress is being made in these situations (not an
exhaustive list):
- the task is asking for the next k,v pair
- a k,v pair is emitted by the map function
- k,v pairs are being spilled to disk
- the map function completes and cleanup has started
- a task is committing or aborting files
- every 10,000 compares during a merge
And in addition user code can explicitly call progress() on the Reporter to
indicate progress is being made if they are doing a particularly expensive
operation to process a single k,v pair.
The basic concept is like a watchdog timer. progress() semantically resets the
timer and keeps the AM from killing the task. If nobody calls progress() then
the task will stop reporting counters and task status, but it will still ping
the AM to let it know the task is still technically alive. If just pings and
no real task status is reported to the AM within a 10 minute window (by
default) then the AM will kill the task due to lack of progress.
So I was thinking that we would add a new boolean flag to TaskStatusUpdateEvent
to indicate whether progress has been indicated by the task since the last time
TaskStatusUpdateEvent was sent. Then the AM could key off of that flag to kill
the task if progress isn't made within a configurable timeout. We would also
sprinkle progress indications in appropriate places in the Tez framework, like
shuffle code, merge code, input processing, etc. where we can tell at the
framework level that the task is moving forward.
> Handle task attempts that are not making progress
> -------------------------------------------------
>
> Key: TEZ-808
> URL: https://issues.apache.org/jira/browse/TEZ-808
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
>
> If a task attempt is not making progress then it may cause the job to hang.
> We may want to kill and restart the attempt. With speculation support and
> free resources we may want to run another version in parallel.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)