[
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440372#comment-15440372
]
Haibo Chen commented on MAPREDUCE-6771:
---------------------------------------
Analysis:
{code:java}
RMContainerAllocator.getResources() {
  ...
  for (ContainerStatus cont : finishedContainers) {
    LOG.info("Received completed container " + cont.getContainerId());
    TaskAttemptId attemptID = assignedRequests.get(cont.getContainerId());
    if (attemptID == null) {
      LOG.error("Container complete event for unknown container id "
          + cont.getContainerId());
    } else {
      pendingRelease.remove(cont.getContainerId());
      assignedRequests.remove(attemptID);
      // send the container completed event to Task attempt
      eventHandler.handle(createContainerFinishedEvent(cont, attemptID));
      // Send the diagnostics
      String diagnostics = StringInterner.weakIntern(cont.getDiagnostics());
      eventHandler.handle(new TaskAttemptDiagnosticsUpdateEvent(attemptID,
          diagnostics));
      preemptionPolicy.handleCompletedContainer(attemptID);
    }
  }
  ...
}
{code}
The scenario in question is as follows: a job is running, and one of its task
attempts running on an NM is killed by that NM because the container exceeds
its resource limit. The container status and diagnostics are sent by the NM to
the RM, and later to the MR AM in its periodic heartbeat with the RM, as shown
above. From the MR AM's perspective, the task attempt is still in the RUNNING
state, since the task heartbeat has not timed out.
Upon learning from the RM that the task attempt's container has finished, the
RMCommunicator thread places a ContainerFinishedEvent and then a
TaskAttemptDiagnosticsUpdateEvent in the event queue.
The ContainerFinishedEvent causes the task attempt in the MR AM to transition
from RUNNING to FAILED, and causes a TaskAttemptUnsuccessfulCompletionEvent
carrying the diagnostics information associated with the attempt at that
moment to be written to the .jhist file. The TaskAttemptDiagnosticsUpdateEvent
updates the diagnostics information associated with the task attempt.
But since the ContainerFinishedEvent is queued and processed before the
TaskAttemptDiagnosticsUpdateEvent, the TaskAttemptUnsuccessfulCompletionEvent
written to the .jhist file does not contain the diagnostics info received from
the RM.
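To make the ordering problem concrete, here is a minimal, self-contained sketch
of the race described above. It is a toy model only: none of these classes are
Hadoop classes, the names (ToyTaskAttempt, recordedDiagnostics, and so on) are
hypothetical, and the history "file" is just an in-memory list. It assumes the
history record is a snapshot of the diagnostics known when the
container-finished event is handled. Queuing the diagnostics update first is
shown only as one possible remedy, not necessarily the fix adopted for this
issue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of the AM event queue; none of these types are Hadoop classes. */
class DiagnosticsOrderingSketch {

    /** Hypothetical stand-ins for the two events queued by RMCommunicator. */
    enum EventType { CONTAINER_FINISHED, DIAGNOSTICS_UPDATE }

    static final class Event {
        final EventType type;
        final String diagnostics; // only meaningful for DIAGNOSTICS_UPDATE

        Event(EventType type, String diagnostics) {
            this.type = type;
            this.diagnostics = diagnostics;
        }
    }

    /** Hypothetical task attempt that records history when it fails. */
    static final class ToyTaskAttempt {
        private final StringBuilder diagnostics = new StringBuilder();
        private final List<String> historyFile = new ArrayList<>();
        private String state = "RUNNING";

        void handle(Event e) {
            switch (e.type) {
                case CONTAINER_FINISHED:
                    state = "FAILED";
                    // Assumption: the history record snapshots whatever
                    // diagnostics are known at this moment.
                    historyFile.add(diagnostics.toString());
                    break;
                case DIAGNOSTICS_UPDATE:
                    diagnostics.append(e.diagnostics);
                    break;
            }
        }
    }

    /** Returns the diagnostics that end up in the toy history record. */
    static String recordedDiagnostics(boolean diagnosticsFirst) {
        Event finished = new Event(EventType.CONTAINER_FINISHED, null);
        Event update = new Event(EventType.DIAGNOSTICS_UPDATE,
                "Container killed by Node Manager");

        Queue<Event> queue = new ArrayDeque<>();
        if (diagnosticsFirst) {
            queue.add(update);
            queue.add(finished);
        } else {
            queue.add(finished); // the order described in this comment
            queue.add(update);
        }

        ToyTaskAttempt attempt = new ToyTaskAttempt();
        for (Event e : queue) { // events are handled in FIFO order
            attempt.handle(e);
        }
        return attempt.historyFile.get(0);
    }

    public static void main(String[] args) {
        System.out.println("finished first:    '"
                + recordedDiagnostics(false) + "'");
        System.out.println("diagnostics first: '"
                + recordedDiagnostics(true) + "'");
    }
}
```

In this toy model, the finished-first order records an empty string, while the
diagnostics-first order records the NM message.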
After the job completes and the user tries to view the failed task attempts
through the JHS, the TaskAttemptUnsuccessfulCompletionEvent is parsed to
generate the failed attempt page. The page will not have the diagnostics info
from the RM (such as container killed by Node Manager...) because it was never
written to .jhist in the first place.
> Diagnostics information can be lost in .jhist if task containers are killed
> by Node Manager.
> --------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.7.3
> Reporter: Haibo Chen
> Assignee: Haibo Chen
>
> Task containers can go over their resource limit and be killed by the Node
> Manager. The MR AM is then notified of the container status and diagnostics
> information through its heartbeat with the RM. However, it is possible that
> the diagnostics information never gets into the .jhist file, so when the job
> completes, the diagnostics information associated with the failed task
> attempts is empty. This makes it hard for users to root-cause job failures
> that are often caused by memory leaks.