There is little information provided when the TaskTracker kills a Task that has 
not reported within the timeout (600 sec) interval - this patch provides a stack 
trace of the task 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-3994
                 URL: https://issues.apache.org/jira/browse/HADOOP-3994
             Project: Hadoop Core
          Issue Type: New Feature
          Components: mapred
    Affects Versions: 0.16.0
            Reporter: Jason
            Priority: Minor
         Attachments: 0.16_patch

When a task is killed for not reporting, sometimes there is an obvious 
programming error, and sometimes the reason the task didn't report is 
unclear.
This patch will cause the TaskTracker to try to generate a stack trace of the 
offending task before the task is killed.
Given how opaque process control is in Java, an external program is run to 
generate the stack trace, using the PID extracted from the undocumented 
UNIXProcess class.
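
For readers who have not opened the attachment, the PID extraction described 
above could look roughly like the sketch below. It is not the code from the 
patch: the class name PidSketch and the method pidOf are hypothetical, and it 
assumes the JVM's java.lang.UNIXProcess keeps the child PID in a private int 
field named "pid". On any JVM where that assumption does not hold, it returns 
-1 instead of failing.

```java
import java.lang.reflect.Field;

// Hypothetical helper, not taken from the attached patch.
final class PidSketch {

    /** Returns the child's PID, or -1 if it cannot be determined on this JVM. */
    static int pidOf(Process process) {
        if (process == null) {
            return -1;
        }
        // Only attempt the lookup on the undocumented Unix implementation class.
        if (!"java.lang.UNIXProcess".equals(process.getClass().getName())) {
            return -1;
        }
        try {
            // Assumption: the implementation stores the PID in a private field "pid".
            Field pidField = process.getClass().getDeclaredField("pid");
            pidField.setAccessible(true);
            return pidField.getInt(process);
        } catch (Exception e) {
            // Field missing or access denied: report "unknown" rather than fail.
            return -1;
        }
    }
}
```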

The attached patch is against 0.16.0, as that is the release we use.
This will only work on Unix machines -- or, more precisely, on JVMs that use 
the java.lang.UNIXProcess implementation for the Java Process object.
The script that generates the stack trace is very Linux-specific.
The code changes will run without failure on JVMs where the UNIXProcess class 
is not available, but no stack trace will be generated.
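
As an illustration of the "no PID, no stack trace, no failure" behaviour, here 
is a minimal sketch of how a stack trace request might be issued. It is an 
assumption, not the attached script: it sends SIGQUIT with "kill -QUIT", which 
makes a HotSpot JVM print a thread dump to its own standard output. The class 
name StackDumpSketch and both method names are hypothetical.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not taken from the attached patch.
final class StackDumpSketch {

    /** Builds the command used to request a thread dump; empty if the PID is unknown. */
    static List<String> dumpCommand(int pid) {
        if (pid <= 0) {
            return Collections.emptyList();
        }
        // Assumption: SIGQUIT is used in place of the patch's Linux-specific script.
        return Arrays.asList("kill", "-QUIT", Integer.toString(pid));
    }

    /** Returns true if the request was sent; never throws to the caller. */
    static boolean requestDump(int pid) {
        List<String> command = dumpCommand(pid);
        if (command.isEmpty()) {
            return false; // PID unavailable: skip the dump and carry on.
        }
        try {
            Process helper = new ProcessBuilder(command).inheritIO().start();
            return helper.waitFor() == 0;
        } catch (IOException e) {
            return false; // e.g. no "kill" executable on this platform
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would invoke requestDump(pid) just before killing the task and 
proceed with the kill regardless of the result.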


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.