[ https://issues.apache.org/jira/browse/HADOOP-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley resolved HADOOP-2108.
---------------------------------

       Resolution: Duplicate
    Fix Version/s: 0.14.3

Duplicate of HADOOP-2036, which was fixed in 0.14.3.

> NullPointerException in JVMMetrics for OOM killed task
> ------------------------------------------------------
>
>                 Key: HADOOP-2108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2108
>             Project: Hadoop
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.14.2
>         Environment: Centos5 jdk1.6.0_02
>            Reporter: Richard Lee
>            Priority: Minor
>             Fix For: 0.14.3
>
>
> I had a reduce task run out of memory and die in such a way that 
> JvmMetrics.doThreadUpdates() throws a NullPointerException.
> The apparent cause is that the call to threadMXBean.getThreadInfo() 
> on JvmMetrics:119 returns an array of ThreadInfo whose elements may be null.
> Here's a relevant quote from the javadoc:
>     This method returns an array of the ThreadInfo objects,
>     each is the thread information about the thread with the same index
>     as in the ids array.
>     If a thread of the given ID is not alive or does not exist,
>     null will be set in the corresponding element
>     in the returned array.  A thread is alive if
>     it has been started and has not yet died.
> My stacktrace looks like this:
> java.lang.NullPointerException
>       at org.apache.hadoop.metrics.jvm.JvmMetrics.doThreadUpdates(JvmMetrics.java:129)
>       at org.apache.hadoop.metrics.jvm.JvmMetrics.doUpdates(JvmMetrics.java:79)
>       at org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:284)
>       at org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50)
>       at org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249)
>       at java.util.TimerThread.mainLoop(Timer.java:512)
>       at java.util.TimerThread.run(Timer.java:462)
> On line 129, there's an attempt to dereference the potentially null 
> threadInfo value to get its current state.
> The naive solution here is to check for null and count null values as 
> "terminated"... but it seems clear that a thread state of TERMINATED and a 
> null ThreadInfo value are distinct cases and may need special treatment.
> I'm guessing that this is a "minor" issue because it seems more cosmetic than 
> mission critical.  I'm not sure what the upstream effects are of this method 
> throwing the NPE, so I didn't set it to "trivial".
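
The null check discussed in the description could look roughly like the sketch below. It is not the fix committed under HADOOP-2036; the class ThreadStateCounter and its Counts holder are hypothetical names chosen for illustration. Only the java.lang.management calls are real JDK API. Null entries are tallied in a separate "missing" counter instead of being folded into TERMINATED, which keeps the two cases distinct.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: counts threads per state
// while tolerating null entries in the ThreadInfo array.
class ThreadStateCounter {

    // Per-state counts plus a separate tally of null ThreadInfo entries.
    static final class Counts {
        final Map<Thread.State, Integer> byState =
                new EnumMap<Thread.State, Integer>(Thread.State.class);
        int missing; // thread ids for which getThreadInfo returned null
    }

    // Counts the given entries; a null entry means the thread was not
    // alive or did not exist when the snapshot was taken.
    static Counts count(ThreadInfo[] infos) {
        Counts counts = new Counts();
        for (ThreadInfo info : infos) {
            if (info == null) {
                counts.missing++; // assumption: track separately, not as TERMINATED
                continue;
            }
            Thread.State state = info.getThreadState();
            Integer previous = counts.byState.get(state);
            counts.byState.put(state, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    // Takes a snapshot of all current thread ids and counts their states.
    // A thread may die between the two calls, which is how nulls appear.
    static Counts snapshot() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.getAllThreadIds();
        return count(bean.getThreadInfo(ids, 0)); // maxDepth 0: no stack traces
    }
}
```

Whether "missing" threads should be reported as their own metric or simply dropped is a design decision for the metrics code; the sketch only shows that the loop need not dereference a null element.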

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
