[ 
https://issues.apache.org/jira/browse/HADOOP-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624582#action_12624582
 ] 

Vinod Kumar Vavilapalli commented on HADOOP-3994:
-------------------------------------------------

Elsewhere, on HADOOP-3581 (yet to be committed), we used a different method of 
obtaining the pid of the launched task. Just before the task is launched, the 
launching shell writes its own pid to a pid file (echo $$ > pidfile) and then 
exec's the task, so the task keeps that pid. The pid file is read later and the 
pid is used for memory management of the process. I guess the same pid file can 
be used for this issue too. This method works wherever the shell feature works.
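
To make the pid-file handoff concrete, here is a rough sketch and not the 
HADOOP-3581 code itself. The "sleep 30" task, the "launcher" argument name and 
the temporary pid file are stand-ins chosen for illustration.

```shell
#!/bin/sh
# Sketch of the pid-file handoff: the launching shell records its own pid,
# then exec replaces that shell with the task, so the pid stays the same.
# "sleep 30" is a stand-in for the real task command (an assumption).

pidfile=$(mktemp)

# Launcher: write $$ (the launcher's pid) to the pid file, then exec the task.
sh -c 'echo $$ > "$1"; exec sleep 30' launcher "$pidfile" &
child=$!

# Supervisor side: wait until the pid file has been written, then read it.
while [ ! -s "$pidfile" ]; do sleep 1; done
recorded=$(cat "$pidfile")

echo "launched=$child recorded=$recorded"

# The recorded pid now refers to the task itself, so it can be signalled.
kill "$recorded"
wait "$child" 2>/dev/null || true
rm -f "$pidfile"
```

Once the pid is known, a supervisor could signal the task before killing it. 
On Sun-derived JVMs, for example, kill -QUIT makes the JVM print a thread dump 
to its stdout, which may be one way to collect the stack trace discussed here.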

But I agree in general that a getPid() method would be good to have.

bq. [..] UNIXProcess is undocumented and only likely to surface on sun-derived 
JVMs; the other risk is instability of their private code [..]
We can write native code to avoid the above. But yes, this implies adding 
another native library.

As a side note, the Java getPid() bug 
(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4244896) has passed its 
ninth birthday :).

Also, can we do anything similar to get more information when streaming/pipe 
tasks time out too?

> There is little information provided when the TaskTracker kills a Task that 
> has not reported within the timeout (600 sec) interval - this patch provides 
> a stack trace of the task 
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3994
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3994
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Jason
>            Priority: Minor
>         Attachments: 0.16_patch
>
>
> When we have a task that is killed for not reporting, sometimes there is an 
> obvious programming error, and sometimes the reason the job didn't report is 
> unclear.
> This patch will cause the TaskTracker to try to generate a stack trace of the 
> offending task before the task is killed.
> Given how opaque process control is in Java, a program is run to generate the 
> stack trace, using the PID extracted from the undocumented UNIXProcess class.
> The attached patch is against 0.16.0, as that is the release we use.
> This will only work on Unix machines, or rather on JVMs that use the 
> java.lang.UNIXProcess implementation for the Java Process object.
> The script that generates the stack trace is very Linux-specific.
> The code changes will run without failure on JVMs where the UNIXProcess class 
> is not available, but no stack trace will be generated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
