[
https://issues.apache.org/jira/browse/PIG-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003293#comment-15003293
]
Rohini Palaniswamy commented on PIG-4418:
-----------------------------------------
With the log messages added in this jira, I found that the problem is that a
Method object added via instance.cleanupMethods.add(method); becomes null after
some time, so invoking it hits an NPE in
https://git.corp.yahoo.com/hadoop/pig/blob/trunk/src/org/apache/pig/JVMReuseImpl.java.
I have not been able to figure out why it would become null and could not
find anything online about why that can happen. We cannot skip calling cleanup
on those methods, as most of them are from key classes in Pig, and I doubt that
those classes went out of scope and were unloaded. So we need to let that
task attempt fail and rerun (Tez does not reuse containers that hit an
exception). I am not sure whether GC has anything to do with it.
-XX:+CMSClassUnloadingEnabled is not set to enable class unloading. I have seen
the problem with ParallelGC on both JDK 7 and JDK 8, so it cannot be attributed
to the PermGen space refactoring done in JDK 8 either.
The first time I saw the error message,
org.apache.pig.impl.util.UDFContext.cleanupStaticData had become null. In a
couple of recent cases, org.apache.pig.impl.PigContext.staticDataCleanup had
become null.
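To illustrate the failure mode described above, here is a minimal sketch of a
reflection-based cleanup registry. It is not the actual JVMReuseImpl code:
CleanupRegistrySketch, Demo, lookup and describe are hypothetical names, and
the error text is only loosely modeled on the message seen in this jira. It
shows how a null entry in a list of Method objects turns into an NPE on invoke,
and how the names of the registered methods can be reported when that happens.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a registry of static cleanup methods.
final class CleanupRegistrySketch {

    // Hypothetical class with a static cleanup hook, used only for illustration.
    public static final class Demo {
        public static int calls = 0;
        public static void staticDataCleanup() { calls++; }
    }

    private final List<Method> cleanupMethods = new ArrayList<>();

    // ArrayList accepts null, so a null entry can be simulated with register(null).
    void register(Method method) {
        cleanupMethods.add(method);
    }

    // Looks up a public static no-arg method, wrapping the checked exception.
    static Method lookup(Class<?> clazz, String name) {
        try {
            return clazz.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Comma-separated list of Class.method names; null entries print as "null".
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Method m : cleanupMethods) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(m == null
                    ? "null"
                    : m.getDeclaringClass().getName() + "." + m.getName());
        }
        return sb.toString();
    }

    // Invokes every registered static method; a null entry raises an NPE here.
    void cleanupAll() {
        for (Method m : cleanupMethods) {
            try {
                m.invoke(null); // static method, so no receiver object
            } catch (Exception e) {
                throw new RuntimeException(
                        "Exception while calling static methods:" + describe(), e);
            }
        }
    }
}
```

Failing the whole cleanup on a null entry, rather than skipping it, matches the
reasoning above: the hooks belong to key classes, so a skipped cleanup would
leave stale static state in a reused container.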
{code}
2015-08-13 20:21:29,646 ERROR [TezChild] pig.JVMReuseImpl: Exception while calling static methods:null,org.apache.pig.impl.PigContext.staticDataCleanup,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup.
null
2015-11-03 14:27:17,672 [ERROR] [TezChild] |pig.JVMReuseImpl|: Exception while calling static methods:null,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup.
null
2015-11-12 17:21:53,577 [ERROR] [TezChild] |pig.JVMReuseImpl|: Exception while calling static methods:null,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup.
null
Full list in a normal run:
org.apache.pig.impl.PigContext.staticDataCleanup,org.apache.pig.impl.util.UDFContext.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.staticDataCleanup,org.apache.pig.impl.util.SpillableMemoryManager.cleanupStaticData,org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.staticDataCleanup,org.apache.pig.tools.pigstats.PigStatusReporter.staticDataCleanup
{code}
One thing the errors have in common is that the first element became null in
both cases.
For now, I filed PIG-4733 to at least call the cleanup of built-in classes
directly and avoid reflection, as that will cover the regular run case and
avoid this issue.
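As a rough sketch of that idea (not the actual PIG-4733 patch), the cleanup
hooks of classes known at compile time can be called directly, with reflection
kept only for hooks registered at runtime. DirectCleanupSketch, BuiltinContext
and BuiltinReporter are hypothetical stand-ins for the built-in Pig classes
listed above.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: direct calls for built-ins, reflection only for extras.
final class DirectCleanupSketch {

    // Hypothetical stand-ins for built-in classes that hold static state.
    public static final class BuiltinContext {
        public static int cleaned = 0;
        public static void staticDataCleanup() { cleaned++; }
    }

    public static final class BuiltinReporter {
        public static int cleaned = 0;
        public static void staticDataCleanup() { cleaned++; }
    }

    // Hooks not known at compile time (for example from user code) still use reflection.
    private final List<Method> extraHooks = new ArrayList<>();

    void registerExtra(Method method) {
        extraHooks.add(method);
    }

    void cleanupStaticData() {
        // Direct calls: no Method object is stored, so there is no entry that could be null.
        BuiltinContext.staticDataCleanup();
        BuiltinReporter.staticDataCleanup();

        // Remaining hooks are still invoked reflectively.
        for (Method m : extraHooks) {
            try {
                m.invoke(null); // static method, so no receiver object
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cleanup hook failed: " + m, e);
            }
        }
    }
}
```

With no extra hooks registered, which corresponds to the regular run case, the
cleanup path in this sketch does not touch reflection at all.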
> NullPointerException in JVMReuseImpl
> ------------------------------------
>
> Key: PIG-4418
> URL: https://issues.apache.org/jira/browse/PIG-4418
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.14.0
> Reporter: Jeff Zhang
> Assignee: Rohini Palaniswamy
> Priority: Critical
> Fix For: 0.15.0
>
> Attachments: PIG-4418-1.patch, PIG-4418-2.patch
>
>
> {code}
> 2015-02-13 15:17:11,067 INFO [TezChild] task.TezTaskRunner: Encounted an error while executing task: attempt_1423730493153_0019_1_04_000002_0
> java.lang.NullPointerException
> at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:46)
> at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:175)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:338)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:166)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)