There is little information provided when the TaskTracker kills a Task that has
not reported within the timeout (600 sec) interval - this patch provides a stack
trace of the task
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HADOOP-3994
URL: https://issues.apache.org/jira/browse/HADOOP-3994
Project: Hadoop Core
Issue Type: New Feature
Components: mapred
Affects Versions: 0.16.0
Reporter: Jason
Priority: Minor
Attachments: 0.16_patch
When a task is killed for not reporting, sometimes there is an obvious
programming error, and sometimes the reason the task didn't report is
unclear.
This patch will cause the TaskTracker to try to generate a stack trace of the
offending task before the task is killed.
Given how opaque process control is in Java, an external program is run to
generate the stack trace, using the PID extracted from the undocumented
UNIXProcess class.
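
As a rough illustration of the PID extraction described above (not the code
from the attached patch), the sketch below reads the private "pid" field of
java.lang.UNIXProcess by reflection. The class name PidSketch and the method
pidOf are hypothetical, and the field name is an assumption about an
undocumented JDK internal; on newer JVMs the implementation class has a
different name, so the method simply reports -1 there.

```java
import java.lang.reflect.Field;

// Hypothetical helper, for illustration only; not taken from the patch.
class PidSketch {
    /** Returns the child's PID, or -1 if it cannot be determined on this JVM. */
    static int pidOf(Process process) {
        if (process == null) {
            return -1;
        }
        // Only the Unix implementation class is assumed to carry an int "pid" field.
        if (!"java.lang.UNIXProcess".equals(process.getClass().getName())) {
            return -1;
        }
        try {
            Field pidField = process.getClass().getDeclaredField("pid");
            pidField.setAccessible(true);
            return pidField.getInt(process);
        } catch (Exception e) {
            // Missing field, security manager, or blocked reflective access:
            // treat all of these as "PID unknown" rather than failing.
            return -1;
        }
    }
}
```

Returning -1 instead of throwing keeps the kill path unchanged on platforms
where the PID is not available.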
The attached patch is against 0.16.0, as that is the release we use.
This will only work on Unix machines -- or on JVMs that use the
java.lang.UNIXProcess implementation for the Java Process object.
The script that generates the stack trace is very Linux specific.
The code changes will run without failure on JVMs where the UNIXProcess class
is not available, but no stack trace will be generated.
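
The fallback behaviour described above could look roughly like the following
sketch, which is an assumption about the shape of the logic and not the
patch itself. The class StackDumpSketch, the method tryDump, the script
argument, and the 10 second wait are all hypothetical; the script is assumed
to take the PID as its only argument.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not taken from the patch.
class StackDumpSketch {
    /**
     * Runs the given dump script with the task's PID before the task is killed.
     * Returns true only if the script ran and exited with status 0.
     */
    static boolean tryDump(int pid, String script) {
        if (pid <= 0) {
            // PID unknown (e.g. no UNIXProcess class): skip the dump, proceed to kill.
            return false;
        }
        try {
            Process dumper = new ProcessBuilder(script, Integer.toString(pid))
                    .inheritIO() // let the script's output land in the caller's logs
                    .start();
            // Do not let a hung script delay the kill indefinitely.
            if (!dumper.waitFor(10, TimeUnit.SECONDS)) {
                dumper.destroy();
                return false;
            }
            return dumper.exitValue() == 0;
        } catch (IOException e) {
            // Script missing or not executable: no stack trace, but no failure either.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Every failure mode returns false, so the caller can go on to kill the task
whether or not a stack trace was produced.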