[
https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916964#comment-13916964
]
Ming Ma commented on MAPREDUCE-5465:
------------------------------------
I discussed this with Ravi offline and will provide a patch for review soon.
The basic approach is to define a new state, FINISHING_CONTAINER, for
TaskAttemptStateInternal. A TaskAttempt will transition to this new state after
it receives the done notification from the task JVM via TaskUmbilicalProtocol.
This gives the container a chance to exit on its own. Normally the attempt
will receive the container exit notification via the NM -> RM -> AM route; if it
doesn't get the notification in time, it will time out and clean up the
container via stopContainer.
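
To make the proposed flow concrete, here is a minimal, self-contained sketch of
the transitions described above. It is not the patch and does not use the real
Hadoop classes: the class name FinishingContainerSketch, the event names
(TA_DONE, CONTAINER_EXITED, FINISHING_TIMEOUT), the SUCCEEDED end state, and the
recorded action strings are all assumptions made for illustration. Only the
FINISHING_CONTAINER state and the stopContainer cleanup come from the comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the proposed attempt lifecycle. */
class FinishingContainerSketch {

    /** Only FINISHING_CONTAINER is taken from the proposal; the rest is assumed. */
    public enum State { RUNNING, FINISHING_CONTAINER, SUCCEEDED }

    /** Hypothetical event names for this sketch. */
    public enum Event { TA_DONE, CONTAINER_EXITED, FINISHING_TIMEOUT }

    private State state = State.RUNNING;
    // Records side effects instead of calling real timers or the NodeManager.
    private final List<String> actions = new ArrayList<>();

    public State handle(Event event) {
        switch (state) {
            case RUNNING:
                if (event == Event.TA_DONE) {
                    // Task JVM reported done: wait for the container to exit
                    // on its own instead of killing it immediately.
                    actions.add("startTimer");
                    state = State.FINISHING_CONTAINER;
                }
                break;
            case FINISHING_CONTAINER:
                if (event == Event.CONTAINER_EXITED) {
                    // Exit notification arrived in time; nothing to clean up.
                    actions.add("cancelTimer");
                    state = State.SUCCEEDED;
                } else if (event == Event.FINISHING_TIMEOUT) {
                    // No exit notification in time: clean up explicitly.
                    actions.add("stopContainer");
                    state = State.SUCCEEDED;
                }
                break;
            default:
                // Events in the end state are ignored in this sketch.
                break;
        }
        return state;
    }

    public State getState() { return state; }

    public List<String> getActions() { return actions; }

    public static void main(String[] args) {
        FinishingContainerSketch attempt = new FinishingContainerSketch();
        attempt.handle(Event.TA_DONE);
        attempt.handle(Event.FINISHING_TIMEOUT);
        System.out.println(attempt.getState() + " " + attempt.getActions());
    }
}
```

In this sketch the container is stopped explicitly only on the timeout path, so
a task JVM that is still writing profile.out gets until the timeout to finish.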
> Container killed before hprof dumps profile.out
> -----------------------------------------------
>
> Key: MAPREDUCE-5465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Affects Versions: 2.0.3-alpha
> Reporter: Radim Kolar
> Assignee: Ming Ma
> Attachments: MAPREDUCE-5465.patch
>
>
> If profiling is enabled for a mapper or reducer, hprof dumps
> profile.out at process exit. It is dumped after the task has signaled to the
> AM that its work is finished.
> The AM kills a container whose work is finished without waiting for hprof to
> finish its dump. If hprof is dumping larger outputs (such as with depth=4,
> while depth=3 works), it cannot finish the dump before being killed, making
> the entire dump unusable because the CPU and heap stats are missing.
> There needs to be a longer delay before the container is killed if profiling
> is enabled.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)