[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776896#action_12776896
 ] 

Vinod K V commented on MAPREDUCE-1119:
--------------------------------------

We can avoid the uncertainty about dumping the stack in the case of a Child 
exception by calling {{taskController.dumpTaskStack(context)}} directly from 
inside {{TaskTracker.markUnresponsiveTasks()}}, immediately before 
{{purgeTask(tip,true)}}. This will create the dump only when it is absolutely 
needed.

Would that work? To construct the context, you may need a bridging method 
inside {{JvmManager}} which can itself call 
{{taskController.dumpTaskStack(context)}}.

That will make the patch a lot simpler, and will avoid many otherwise 
unnecessary changes to {{JvmManager}}/{{TaskController}}. If you wish, I can 
upload a patch demonstrating this. Thoughts? 
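
For illustration, the call order I have in mind could look roughly like the 
sketch below. This is not a patch against trunk: every Sketch* class is a 
made-up stand-in, the bridging method name dumpStack() is an assumption, and 
the real context type, timeout handling and signatures will differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the context handed to the task controller. */
class SketchContext {
    final String taskId;
    SketchContext(String taskId) { this.taskId = taskId; }
}

/** Stand-in for TaskController; records the call instead of signalling a JVM. */
class SketchTaskController {
    private final List<String> events;
    SketchTaskController(List<String> events) { this.events = events; }

    void dumpTaskStack(SketchContext context) {
        events.add("dump " + context.taskId);
    }
}

/** Stand-in for JvmManager, holding only the bridging method. */
class SketchJvmManager {
    private final SketchTaskController taskController;
    SketchJvmManager(SketchTaskController taskController) {
        this.taskController = taskController;
    }

    // Bridging method (name is an assumption): builds the context, then delegates.
    void dumpStack(String taskId) {
        taskController.dumpTaskStack(new SketchContext(taskId));
    }
}

/** Stand-in for the TaskTracker side of the proposal. */
class SketchTaskTracker {
    final List<String> events = new ArrayList<>();
    private final SketchJvmManager jvmManager =
        new SketchJvmManager(new SketchTaskController(events));

    void purgeTask(String taskId, boolean wasFailure) {
        events.add("purge " + taskId);
    }

    // Only tasks that have exceeded the timeout get a dump, and the dump is
    // requested immediately before the purge.
    void markUnresponsiveTasks(Map<String, Long> lastReportMillis,
                               long nowMillis, long timeoutMillis) {
        for (Map.Entry<String, Long> entry : lastReportMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                jvmManager.dumpStack(entry.getKey());
                purgeTask(entry.getKey(), true);
            }
        }
    }
}

class SketchDemo {
    public static void main(String[] args) {
        SketchTaskTracker tracker = new SketchTaskTracker();
        Map<String, Long> lastReport = new LinkedHashMap<>();
        lastReport.put("attempt_1", 0L);    // silent for 1000 ms
        lastReport.put("attempt_2", 900L);  // reported 100 ms ago
        tracker.markUnresponsiveTasks(lastReport, 1000L, 600L);
        System.out.println(tracker.events);
    }
}
```

The point of the shape is that nothing outside markUnresponsiveTasks() needs 
to know about stack dumps, so tasks that die for other reasons never trigger 
one.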

> When tasks fail to report status, show tasks's stack dump before killing
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1119
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1119
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1119.2.patch, MAPREDUCE-1119.3.patch, 
> MAPREDUCE-1119.4.patch, MAPREDUCE-1119.patch
>
>
> When the TT kills tasks that haven't reported status, it should somehow 
> gather a stack dump for the task. This could be done either by sending a 
> SIGQUIT (so the dump ends up in stdout) or perhaps something like JDI to 
> gather the stack directly from Java. This may be somewhat tricky since the 
> child may be running as another user (so the SIGQUIT would have to go through 
> LinuxTaskController). This feature would make debugging these kinds of 
> failures much easier, especially if we could somehow get it into the 
> TaskDiagnostic message

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
