[
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447195#comment-15447195
]
Jason Lowe commented on MAPREDUCE-6771:
---------------------------------------
bq. so my understanding of this is there should be ideally one such event in
the jhist file
Yes, ideally we should avoid emitting more than one
TaskAttemptUnsuccessfulCompletion event. There are other tools besides the JHS
that look at these jhist files, and I don't know how well they will handle more
than one of these for the same attempt.
Note that we aren't stuck with the TaskAttemptUnsuccessfulCompletion event for
doing diagnostics. We could use some new diagnostic event just for this
purpose, but that too could cause trouble for jhist parsers that don't skip
unknown records.
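To illustrate the parser concern, here is a minimal sketch of a reader that
skips event types it does not recognize. It is not the JHS parser or any real
tool; the class name, the set of known types, and the
ATTEMPT_DIAGNOSTICS_ADDENDUM type are all hypothetical, and events are reduced
to plain type strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical reader, for illustration only.
final class TolerantJhistReader {
    // Assumed subset of event types this reader understands.
    static final Set<String> KNOWN_TYPES =
        Set.of("TASK_STARTED", "MAP_ATTEMPT_FAILED");

    // Returns only the event types the reader understands, in order.
    // Unknown types are skipped instead of aborting the whole parse.
    static List<String> knownEvents(List<String> eventTypes) {
        List<String> kept = new ArrayList<>();
        for (String type : eventTypes) {
            if (KNOWN_TYPES.contains(type)) {
                kept.add(type);
            }
        }
        return kept;
    }
}
```

A parser that throws on an unrecognized type instead of skipping it is the
kind that a new diagnostic event would break.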
As for postponing, we can probably move it farther down the state machine,
but waiting for a container completion event is not something the state machine
does today. For example, the FAIL_FINISHING_CONTAINER state just waits
for the AM to send a kill container request to the NM; it does not actually wait
for the container completion event, and it would need to. Another issue
with postponing is the dependency on the container completion event. There
have been issues in the past where the MR AM "missed" a container completion
event, which caused a scheduling hang. We'd need some kind of safety valve to
prevent the AM from waiting forever for a completion event that never
arrives. Another issue with waiting is that if the AM crashes after a task
reports failure but before the container completion event arrives, that
failure won't be noticed by the subsequent AM attempt. (Yes, this race occurs
today, but the window would be significantly wider.) Those kinds of issues make
the "let's record some additional diagnostics after the fact" approach more
appealing, since we do exactly what we do today and add an addendum if a
container completion event has more info after an attempt completion has
already been recorded in the jhist file.
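As a rough illustration of the "safety valve" idea, the sketch below waits a
bounded time for container diagnostics and falls back to what the task itself
reported if the completion event never shows up. It is not MR AM code: the
class, the method, and the use of a CompletableFuture to stand in for the
container completion event are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
final class CompletionWait {
    // Waits up to timeoutMs for the container's diagnostics. If they arrive,
    // they are appended to the task-reported diagnostics; otherwise only the
    // task-reported diagnostics are returned, so the caller never blocks
    // forever on a completion event that was missed.
    static String awaitDiagnostics(CompletableFuture<String> containerCompleted,
                                   String taskReported,
                                   long timeoutMs) {
        try {
            String fromContainer =
                containerCompleted.get(timeoutMs, TimeUnit.MILLISECONDS);
            return taskReported + "; " + fromContainer;
        } catch (TimeoutException e) {
            // Safety valve: give up waiting and record what we have.
            return taskReported;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return taskReported;
        } catch (ExecutionException e) {
            return taskReported;
        }
    }
}
```

With a future that never completes, awaitDiagnostics(f, "task failed", 50)
returns just "task failed" after roughly 50 ms. The sketch does nothing about
the AM-crash window described above.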
Both approaches have pros and cons, and I'm not sure which I prefer yet.
> Diagnostics information can be lost in .jhist if task containers are killed
> by Node Manager.
> --------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.7.3
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Attachments: TaUnsuccessfullyEventEmission.jpg,
> mapreduce6771.001.patch
>
>
> Task containers can go over their resource limit and be killed by the Node
> Manager. The MR AM is then notified of the container status and diagnostics
> information through its heartbeat with the RM. However, it is possible that
> the diagnostics information never gets into the .jhist file, so when the job
> completes, the diagnostics information associated with the failed task
> attempts is empty. This makes it hard for users to root-cause job failures
> that are often caused by memory leaks.