[ https://issues.apache.org/jira/browse/MAPREDUCE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776008#action_12776008 ]

Doug Cutting commented on MAPREDUCE-1119:
-----------------------------------------

> Is all this worth it just to avoid stack dumps in aborted speculative task
> attempts?

Probably so, although I've not looked at the details. A task timing out is a bad thing. A speculative task getting killed is normal. One should be able to easily scan a job's log for bad things. These are all internal APIs, right? If so, it shouldn't create problems to change them, should it?

> When tasks fail to report status, show tasks's stack dump before killing
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1119
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1119
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1119.2.patch, MAPREDUCE-1119.patch
>
>
> When the TT kills tasks that haven't reported status, it should somehow
> gather a stack dump for the task. This could be done either by sending a
> SIGQUIT (so the dump ends up in stdout) or perhaps something like JDI to
> gather the stack directly from Java. This may be somewhat tricky since the
> child may be running as another user (so the SIGQUIT would have to go through
> LinuxTaskController). This feature would make debugging these kinds of
> failures much easier, especially if we could somehow get it into the
> TaskDiagnostic message

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.