[
https://issues.apache.org/jira/browse/HDFS-15219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated HDFS-15219:
-------------------------------
Description:
In my case, a Tez application was stuck for more than 2 hours until we killed
it. The cause was a single stuck task attempt, which was never replaced
because speculative execution was disabled.
The exception looked like this:
{code:java}
2020-03-11 01:23:59,141 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records read - 100000
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.FileSinkOperator|: FS[3]: records written - 1000000
2020-03-11 01:24:50,294 [INFO] [TezChild] |exec.MapOperator|: MAP[4]: records read - 1000000
2020-03-11 01:29:02,967 [FATAL] [ResponseProcessor for block BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[ResponseProcessor for block BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073,5,main] threw an Error. Shutting down now...
java.lang.NoClassDefFoundError: com/google/protobuf/TextFormat
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.toString(PipelineAck.java:253)
    at java.lang.String.valueOf(String.java:2847)
    at java.lang.StringBuilder.append(StringBuilder.java:128)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:737)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.TextFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 4 more
Caused by: java.util.zip.ZipException: error reading zip file
    at java.util.zip.ZipFile.read(Native Method)
    at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
    at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
    at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:415)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
    at sun.misc.Resource.getBytes(Resource.java:124)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:444)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    ... 10 more
2020-03-11 01:29:02,970 [INFO] [ResponseProcessor for block BP-1856561198-172.16.6.67-1421842461517:blk_15177828027_14109212073] |util.ExitUtil|: Exiting with status -1
2020-03-11 03:27:26,833 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Received should die response from AM
2020-03-11 03:27:26,834 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Asked to die via task heartbeat
2020-03-11 03:27:26,839 [INFO] [TaskHeartbeatThread] |task.TezTaskRunner2|: Attempting to abort attempt_1583335296048_917815_3_01_000704_0 due to an invocation of shutdownRequested
{code}
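The trace above shows a java.lang.Error (NoClassDefFoundError), not an
Exception, escaping the ResponseProcessor thread. As a rough illustration of
why that distinction can matter, the sketch below shows a worker whose
"catch (Exception e)" handler never runs for an Error, so anything waiting on
the worker's close signal is never released. This is only an assumed mechanism
for the hang, not something the log confirms, and the class and method names
(ErrorVsExceptionSketch, waiterReleased) are hypothetical and are not HDFS
code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ErrorVsExceptionSketch {

    /**
     * Runs a worker that fails with an Error and reports whether a waiter
     * was released. Hypothetical stand-in, NOT the HDFS ResponseProcessor.
     */
    static boolean waiterReleased(boolean catchThrowable) {
        CountDownLatch closed = new CountDownLatch(1);

        // Simulates the failure in the log: an Error, not an Exception.
        Runnable body = () -> {
            throw new NoClassDefFoundError("com/google/protobuf/TextFormat");
        };

        Thread worker = new Thread(() -> {
            if (catchThrowable) {
                try {
                    body.run();
                } catch (Throwable t) {
                    closed.countDown(); // Errors are handled; waiter is released
                }
            } else {
                try {
                    body.run();
                } catch (Exception e) {
                    closed.countDown(); // never reached for an Error
                }
            }
        });
        // Swallow the uncaught Error so the demo prints no stack trace.
        worker.setUncaughtExceptionHandler((t, e) -> { });
        worker.start();

        try {
            worker.join();
            // A real waiter might block indefinitely; the sketch uses a timeout.
            return closed.await(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
    }

    public static void main(String[] args) {
        System.out.println("catch (Exception): released = " + waiterReleased(false));
        System.out.println("catch (Throwable): released = " + waiterReleased(true));
    }
}
```

In the sketch, only the catch (Throwable) variant signals the waiter. Whether
the real client should catch Throwable, or fail in some other way, is a design
decision for the actual fix and is not settled here.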
> DFS Client will stuck when ResponseProcessor.run throw Error
> ------------------------------------------------------------
>
> Key: HDFS-15219
> URL: https://issues.apache.org/jira/browse/HDFS-15219
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.7.3
> Reporter: zhengchenyu
> Priority: Major
> Fix For: 3.2.2
>
> Original Estimate: 672h
> Remaining Estimate: 672h