[
https://issues.apache.org/jira/browse/PIG-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552324#comment-14552324
]
Hari Sekhon commented on PIG-4418:
----------------------------------
I'm also hitting this with Pig on Tez while running large multi-hour jobs.
Each job is identical, processing a single day's logs (hundreds of millions
of records), and is parameterized so that only the date to process changes on
each run. Some jobs fail with this error a short way into their long runtime;
it's not clear why some succeed and others fail intermittently. Here is an
example failure:
{code}
2015-05-20 14:13:41,905 [Timer-0] INFO
org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status:
status=RUNNING, progress=TotalTasks: 2002 Succeeded: 50 Running: 67 Failed: 0
Killed: 0, diagnostics=
2015-05-20 14:13:55,102 [PigTezLauncher-0] INFO
org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status:
status=FAILED, progress=TotalTasks: 2002 Succeeded: 50 Running: 0 Failed: 1
Killed: 1951 FailedTaskAttempts: 1, diagnostics=Vertex failed,
vertexName=scope-26, vertexId=vertex_1432022048010_0042_1_00, diagnostics=[Task
failed, taskId=task_1432022048010_0042_1_00_000044, diagnostics=[TaskAttempt 0
failed, info=[Error: Failure while running task:java.lang.NullPointerException
at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:44)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:174)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:334)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:178)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex
vertex_1432022048010_0042_1_00 [scope-26] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0,
counters=Counters: 16
org.apache.tez.common.counters.DAGCounter
NUM_FAILED_TASKS=1
NUM_KILLED_TASKS=1951
NUM_SUCCEEDED_TASKS=50
TOTAL_LAUNCHED_TASKS=117
File System Counters
HDFS_BYTES_READ=5604569192
HDFS_BYTES_WRITTEN=0
HDFS_READ_OPS=50
HDFS_LARGE_READ_OPS=0
HDFS_WRITE_OPS=0
org.apache.tez.common.counters.TaskCounter
GC_TIME_MILLIS=84086
CPU_MILLISECONDS=11549340
PHYSICAL_MEMORY_BYTES=20238544896
VIRTUAL_MEMORY_BYTES=178987933696
COMMITTED_HEAP_BYTES=105483599872
INPUT_RECORDS_PROCESSED=30867702
OUTPUT_RECORDS=30867652
{code}
> NullPointerException in JVMReuseImpl
> ------------------------------------
>
> Key: PIG-4418
> URL: https://issues.apache.org/jira/browse/PIG-4418
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.14.0
> Reporter: Jeff Zhang
> Assignee: Rohini Palaniswamy
> Priority: Critical
> Fix For: 0.15.0
>
> Attachments: PIG-4418-1.patch
>
>
> {code}
> 2015-02-13 15:17:11,067 INFO [TezChild] task.TezTaskRunner: Encounted an
> error while executing task: attempt_1423730493153_0019_1_04_000002_0
> java.lang.NullPointerException
> at org.apache.pig.JVMReuseImpl.cleanupStaticData(JVMReuseImpl.java:46)
> at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.close(PigProcessor.java:175)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:338)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:166)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}