[ https://issues.apache.org/jira/browse/HDFS-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147443#comment-14147443 ]
Jianshi Huang commented on HDFS-6999:
-------------------------------------

I'm having the same issue, and it's reproducible. My stack trace looks like this:

{noformat}
"Executor task launch worker-3" java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:257)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.readChannelFully(PacketReceiver.java:258)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:209)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:173)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:138)
    at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:683)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:739)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:796)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:599)
    at parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:360)
    at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:100)
    at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:172)
    at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:130)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:139)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
{noformat}

I'm running Spark 1.1.0 on HDP 2.1 (HDFS 2.4.0); the task reads a bunch of Parquet files.

Jianshi

> PacketReceiver#readChannelFully is in an infinite loop
> ------------------------------------------------------
>
>                 Key: HDFS-6999
>                 URL: https://issues.apache.org/jira/browse/HDFS-6999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs-client
>    Affects Versions: 2.4.1
>            Reporter: Yang Jiandan
>            Priority: Critical
>
> In our cluster, we found that an HBase handler may never return when it reads an HDFS
> file using RemoteBlockReader2, and the handler thread occupies 100% CPU. We
> found this is because PacketReceiver#readChannelFully is in an infinite loop:
> the following while loop never breaks.
> {code:java}
> while (buf.remaining() > 0) {
>   int n = ch.read(buf);
>   if (n < 0) {
>     throw new IOException("Premature EOF reading from " + ch);
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
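For context on the loop quoted in the issue: `ReadableByteChannel.read` can return 0 without reaching EOF, and the quoted loop only breaks on a negative return, so a channel that repeatedly returns 0 spins at 100% CPU exactly as described. The sketch below is a minimal, hypothetical defensive variant (the helper name `readChannelFully` mirrors the method under discussion, but this is not the actual HDFS fix) that fails fast instead of busy-looping:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ReadFullyDemo {
    // Hypothetical helper: fill buf completely from ch, throwing instead of
    // spinning when the channel stops making progress.
    static void readChannelFully(ReadableByteChannel ch, ByteBuffer buf) throws IOException {
        while (buf.remaining() > 0) {
            int n = ch.read(buf);
            if (n < 0) {
                throw new IOException("Premature EOF reading from " + ch);
            }
            if (n == 0) {
                // A blocking channel should block rather than return 0 here;
                // treating 0 as an error avoids the 100%-CPU busy loop.
                throw new IOException("Zero-byte read from " + ch);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes();
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(data));
        ByteBuffer buf = ByteBuffer.allocate(data.length);
        readChannelFully(ch, buf);
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        System.out.println("read: " + new String(out));
    }
}
```

With a blocking channel the loop terminates normally; the extra `n == 0` check only matters for a misbehaving or non-blocking channel, which is one plausible way to surface the hang described above as an exception instead.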