[ 
https://issues.apache.org/jira/browse/PIG-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003293#comment-15003293
 ] 

Rohini Palaniswamy commented on PIG-4418:
-----------------------------------------

With the log messages added in this jira, I found that the problem is that a 
Method object added via instance.cleanupMethods.add(method); becomes null after 
some time, so invoking it hits an NPE in 
https://git.corp.yahoo.com/hadoop/pig/blob/trunk/src/org/apache/pig/JVMReuseImpl.java.
 I have not been able to figure out why it would become null, and could not 
find anything online about why that can happen. We cannot skip calling cleanup 
on those methods, as most of them are from key classes in Pig, and I doubt that 
those classes went out of scope and were unloaded. So we need to let that 
task attempt fail and rerun (Tez does not reuse containers that hit an 
exception). I am not sure whether GC has anything to do with it. No 
-XX:+CMSClassUnloadingEnabled is set that would allow classes to be unloaded. 
I have seen it with ParallelGC on both JDK 7 and JDK 8, so it cannot be 
attributed to the PermGen space refactoring done in JDK 8 either.
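
To make the failure mode concrete, here is a minimal sketch of reflective 
cleanup over a list of Method objects. It is not the actual JVMReuseImpl code: 
the class ReflectiveCleanupSketch, the nested Holder class and the helper names 
are hypothetical, and the null entry is added by hand because the real cause is 
unknown. It only shows how a null entry can be reported by name and turned into 
a failure instead of being skipped.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class ReflectiveCleanupSketch {
    // Hypothetical stand-in for a Pig class that exposes a static cleanup hook.
    public static class Holder {
        public static int cleanups = 0;
        public static void staticDataCleanup() { cleanups++; }
    }

    // Renders the list the way the log line does: "null" for a missing entry.
    public static String describe(List<Method> methods) {
        StringBuilder sb = new StringBuilder();
        for (Method m : methods) {
            if (sb.length() > 0) sb.append(',');
            sb.append(m == null ? "null"
                    : m.getDeclaringClass().getName() + "." + m.getName());
        }
        return sb.toString();
    }

    // Invokes every hook; a null entry fails the call instead of being skipped.
    public static void runAll(List<Method> methods) {
        for (Method m : methods) {
            if (m == null) {
                throw new IllegalStateException(
                        "Null cleanup method among: " + describe(methods));
            }
            try {
                m.invoke(null); // static method, so no receiver object
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cleanup failed: " + m, e);
            }
        }
    }

    // Looks up a public no-arg method, wrapping the checked exception.
    public static Method lookup(Class<?> c, String name) {
        try {
            return c.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        List<Method> methods = new ArrayList<>();
        methods.add(null); // simulated: the real cause of the null is unknown
        methods.add(lookup(Holder.class, "staticDataCleanup"));
        try {
            runAll(methods);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing here matches the reasoning above: the attempt fails and is rerun, and 
the container is not reused with stale static data.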

The first time I saw the error message, 
org.apache.pig.impl.util.UDFContext.cleanupStaticData had become null. In a 
couple of recent cases, org.apache.pig.impl.PigContext.staticDataCleanup had 
become null.

{code}
2015-08-13 20:21:29,646 ERROR [TezChild] pig.JVMReuseImpl: Exception while calling static methods:null,org.apache.pig.impl.PigContext.staticDataCleanup,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup. null

2015-11-03 14:27:17,672 [ERROR] [TezChild] |pig.JVMReuseImpl|: Exception while calling static methods:null,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup. null

2015-11-12 17:21:53,577 [ERROR] [TezChild] |pig.JVMReuseImpl|: Exception while calling static methods:null,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup. null



Full list in a normal run:
org.apache.pig.impl.PigContext.staticDataCleanup,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup
{code}

One common thing I see with the errors is that the first element became null in 
both cases.

For now, I filed PIG-4733 to at least call cleanup of the built-in classes 
directly and avoid reflection, as that will cover the regular run case and 
avoid this issue there.
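
As a rough illustration of the direct-call idea (not the PIG-4733 patch 
itself), the sketch below calls static cleanup hooks by name instead of going 
through stored Method objects. DirectCleanupSketch, ContextA and ContextB are 
hypothetical stand-ins for built-in classes that hold static state.

```java
public class DirectCleanupSketch {
    // Hypothetical stand-ins for built-in classes that keep static state.
    public static class ContextA {
        public static String cached = "a";
        public static void staticDataCleanup() { cached = null; }
    }

    public static class ContextB {
        public static String cached = "b";
        public static void cleanupStaticData() { cached = null; }
    }

    // Built-in hooks are invoked directly, so there is no list of Method
    // objects that could contain a null entry.
    public static void cleanupBuiltins() {
        ContextA.staticDataCleanup();
        ContextB.cleanupStaticData();
    }

    public static void main(String[] args) {
        cleanupBuiltins();
        System.out.println(ContextA.cached + "," + ContextB.cached);
    }
}
```

Classes outside the built-in set would still need some registration mechanism, 
which this sketch leaves out.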

> NullPointerException in JVMReuseImpl
> ------------------------------------
>
>                 Key: PIG-4418
>                 URL: https://issues.apache.org/jira/browse/PIG-4418
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Jeff Zhang
>            Assignee: Rohini Palaniswamy
>            Priority: Critical
>             Fix For: 0.15.0
>
>         Attachments: PIG-4418-1.patch, PIG-4418-2.patch
>
>
> {code}
> 2015-02-13 15:17:11,067 INFO [TezChild] task.TezTaskRunner: Encounted an error while executing task: attempt_1423730493153_0019_1_04_000002_0
> java.lang.NullPointerException
>       at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:46)
>       at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:175)
>       at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:338)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:171)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:166)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
