[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447195#comment-15447195
 ] 

Jason Lowe commented on MAPREDUCE-6771:
---------------------------------------

bq.  so my understanding of this is there should be ideally one such event in 
the jhist file

Yes, ideally we should avoid emitting more than one 
TaskAttemptUnsuccessfulCompletion event.  There are other tools besides the JHS 
that look at these jhist files, and I don't know how well they will handle more 
than one of these for the same attempt.

Note that we aren't stuck with the TaskAttemptUnsuccessfulCompletion event for 
carrying diagnostics.  We could use a new diagnostic event just for this 
purpose, but that too could cause trouble for jhist parsers that don't skip 
unknown records.
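
To illustrate the parser concern, here is a minimal sketch of a reader that 
skips record types it does not recognize instead of failing on them.  
Everything in it is hypothetical: the class name, the "TYPE|attemptId|payload" 
line layout, and the event type names are made up for illustration and are not 
the real jhist format or any existing Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a reader that tolerates unknown record types. */
final class TolerantHistoryReader {
    /** Record types this (older) reader understands; names are illustrative. */
    private static final Set<String> KNOWN_TYPES =
        new HashSet<>(Arrays.asList("ATTEMPT_STARTED", "ATTEMPT_FAILED"));

    /** Returns only the records whose type is known, preserving order. */
    static List<String> readKnown(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // Assumed layout: TYPE|attemptId|payload
            String type = line.split("\\|", 2)[0];
            if (KNOWN_TYPES.contains(type)) {
                kept.add(line);
            }
            // Unknown types fall through and are skipped, not treated as errors.
        }
        return kept;
    }
}
```

A reader written this way would ignore a new addendum-style record, while one 
that throws on an unrecognized type would break as soon as such a record 
appeared in the file.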

As for the postponing, we can probably move it farther down the state machine, 
but waiting for a container completion event is not something the state machine 
does today.  For example, the FAIL_FINISHING_CONTAINER state just waits for the 
AM to send a kill container request to the NM; it does not actually wait for 
the container completion event, and it would need to.  Another issue with 
postponing is the dependency on the container completion event.  There have 
been issues in the past where the MR AM "missed" a container completion event 
and caused a scheduling hang.  We'd need some kind of safety valve to prevent 
the AM from waiting forever for a completion event that never arrives.  Another 
issue with waiting is that if the AM crashes after a task reported failure but 
before the container completion event arrived, then that failure won't be 
noticed by the subsequent AM attempt.  (Yes, this race exists today, but the 
window would be significantly wider.)  Those kinds of issues make the "let's 
record some additional diagnostics after the fact" approach more appealing: we 
do exactly what we do today, plus an addendum if a container completion event 
has more info after an attempt completion has already been recorded in the 
jhist file.
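
The "safety valve" idea can be sketched as a bounded wait: take the container 
diagnostics if the completion event shows up in time, otherwise fall back to 
whatever the task itself reported.  This is only an illustration under stated 
assumptions.  The class and method names are hypothetical, and a blocking wait 
on a CompletableFuture is used for brevity; an event-driven state machine would 
more likely need a timer-driven transition instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; none of these names exist in Hadoop. */
final class CompletionWait {
    /**
     * Waits up to timeoutMillis for the container completion diagnostics.
     * Returns the fallback if the event is late, lost, or failed.
     */
    static String awaitDiagnostics(CompletableFuture<String> containerCompletion,
                                   long timeoutMillis,
                                   String fallback) {
        try {
            return containerCompletion.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Safety valve: the completion event never arrived, so stop waiting.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        }
    }
}
```

Note that a timeout only bounds the wait.  It does not help with the AM-crash 
window described above, which is part of why recording an addendum after the 
fact remains attractive.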

Both approaches have pros and cons, and I'm not sure which I prefer yet.

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6771
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.7.3
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: TaUnsuccessfullyEventEmission.jpg, 
> mapreduce6771.001.patch
>
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager.  The MR AM is then notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root cause job failures, 
> which are often caused by memory leaks.



