[ 
https://issues.apache.org/jira/browse/PIG-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552324#comment-14552324
 ] 

Hari Sekhon commented on PIG-4418:
----------------------------------

I'm also hitting this with Pig on Tez while running large multi-hour jobs. 
Each job is identical, processing a single day's logs (hundreds of millions 
of records), and is parameterized so that only the date to process changes on 
each run. Some jobs fail with this error a short way into their long runtime, 
and it's not clear why some succeed while others fail intermittently. Here is 
an example failure:
{code}2015-05-20 14:13:41,905 [Timer-0] INFO  
org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
status=RUNNING, progress=TotalTasks: 2002 Succeeded: 50 Running: 67 Failed: 0 
Killed: 0, diagnostics=
2015-05-20 14:13:55,102 [PigTezLauncher-0] INFO  
org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
status=FAILED, progress=TotalTasks: 2002 Succeeded: 50 Running: 0 Failed: 1 
Killed: 1951 FailedTaskAttempts: 1, diagnostics=Vertex failed, 
vertexName=scope-26, vertexId=vertex_1432022048010_0042_1_00, diagnostics=[Task 
failed, taskId=task_1432022048010_0042_1_00_000044, diagnostics=[TaskAttempt 0 
failed, info=[Error: Failure while running task:java.lang.NullPointerException
        at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:44)
        at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:174)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:334)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:178)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
vertex_1432022048010_0042_1_00 [scope-26] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0, 
counters=Counters: 16
        org.apache.tez.common.counters.DAGCounter
                NUM_FAILED_TASKS=1
                NUM_KILLED_TASKS=1951
                NUM_SUCCEEDED_TASKS=50
                TOTAL_LAUNCHED_TASKS=117
        File System Counters
                HDFS_BYTES_READ=5604569192
                HDFS_BYTES_WRITTEN=0
                HDFS_READ_OPS=50
                HDFS_LARGE_READ_OPS=0
                HDFS_WRITE_OPS=0
        org.apache.tez.common.counters.TaskCounter
                GC_TIME_MILLIS=84086
                CPU_MILLISECONDS=11549340
                PHYSICAL_MEMORY_BYTES=20238544896
                VIRTUAL_MEMORY_BYTES=178987933696
                COMMITTED_HEAP_BYTES=105483599872
                INPUT_RECORDS_PROCESSED=30867702
                OUTPUT_RECORDS=30867652
{code}

> NullPointerException in JVMReuseImpl
> ------------------------------------
>
>                 Key: PIG-4418
>                 URL: https://issues.apache.org/jira/browse/PIG-4418
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Jeff Zhang
>            Assignee: Rohini Palaniswamy
>            Priority: Critical
>             Fix For: 0.15.0
>
>         Attachments: PIG-4418-1.patch
>
>
> {code}
> 2015-02-13 15:17:11,067 INFO [TezChild] task.TezTaskRunner: Encounted an 
> error while executing task: attempt_1423730493153_0019_1_04_000002_0
> java.lang.NullPointerException
>       at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:46)
>       at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:175)
>       at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:338)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:171)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:166)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
