[
https://issues.apache.org/jira/browse/HDFS-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502020#comment-14502020
]
Steve Loughran commented on HDFS-8160:
--------------------------------------
From the stack trace
{code}
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while
waiting for channel to be ready for connect. ch :
java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
{code}
the datanode at 10.40.8.10:50010 appears to be down or unreachable, so each
connection attempt blocks for the full 60-second connect timeout.
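Lowering the client-side connect timeout won't fix the dead datanode, but it
shortens the stall before the client fails over to another replica. A sketch of
the client-side setting (value in milliseconds; `dfs.client.socket-timeout` is
the standard HDFS client socket/connect timeout key — verify against your
version's hdfs-default.xml):
{code}
<!-- client-side hdfs-site.xml: shorten the per-connect stall from 60 s to 5 s.
     This only makes the client give up on the dead node sooner; it does not
     repair the node itself. -->
<property>
  <name>dfs.client.socket-timeout</name>
  <value>5000</value>
</property>
{code}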
> Long delays when calling hdfsOpenFile()
> ---------------------------------------
>
> Key: HDFS-8160
> URL: https://issues.apache.org/jira/browse/HDFS-8160
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: libhdfs
> Affects Versions: 2.5.2
> Environment: 3-node Apache Hadoop 2.5.2 cluster running on Ubuntu
> 14.04
> dfshealth overview:
> Security is off.
> Safemode is off.
> 8 files and directories, 9 blocks = 17 total filesystem object(s).
> Heap Memory used 45.78 MB of 90.5 MB Heap Memory. Max Heap Memory is 889 MB.
> Non Heap Memory used 36.3 MB of 70.44 MB Committed Non Heap Memory. Max Non
> Heap Memory is 130 MB.
> Configured Capacity: 118.02 GB
> DFS Used: 2.77 GB
> Non DFS Used: 12.19 GB
> DFS Remaining: 103.06 GB
> DFS Used%: 2.35%
> DFS Remaining%: 87.32%
> Block Pool Used: 2.77 GB
> Block Pool Used%: 2.35%
> DataNodes usages% (Min/Median/Max/stdDev): 2.35% / 2.35% / 2.35% / 0.00%
> Live Nodes 3 (Decommissioned: 0)
> Dead Nodes 0 (Decommissioned: 0)
> Decommissioning Nodes 0
> Number of Under-Replicated Blocks 0
> Number of Blocks Pending Deletion 0
> Datanode Information
> In operation
> Node                          Last contact  Admin State  Capacity  Used       Non DFS Used  Remaining  Blocks  Block pool used    Failed Volumes  Version
> hadoop252-3 (x.x.x.10:50010)  1             In Service   39.34 GB  944.85 MB  3.63 GB       34.79 GB   9       944.85 MB (2.35%)  0               2.5.2
> hadoop252-1 (x.x.x.8:50010)   0             In Service   39.34 GB  944.85 MB  4.94 GB       33.48 GB   9       944.85 MB (2.35%)  0               2.5.2
> hadoop252-2 (x.x.x.9:50010)   1             In Service   39.34 GB  944.85 MB  3.63 GB       34.79 GB   9       944.85 MB (2.35%)  0               2.5.2
> java version "1.7.0_76"
> Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
> Reporter: Rod
>
> Calling hdfsOpenFile() on a file residing on the target 3-node Hadoop cluster
> (described in detail in the Environment section) blocks for a long time
> (several minutes). I've noticed that the delay is related to the size of the
> target file: for example, hdfsOpenFile() on a file of 852483361 bytes took
> 121 seconds, while a file of 15458 bytes took less than a second.
> Also, during the long delay, the following stack trace is written to standard
> out:
> 2015-04-16 10:32:13,943 WARN [main] hdfs.BlockReaderFactory
> (BlockReaderFactory.java:getRemoteBlockReaderFromTcp(693)) - I/O error
> constructing remote block reader.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while
> waiting for channel to be ready for connect. ch :
> java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
> at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
> at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
> at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
> at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
> at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
> at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> 2015-04-16 10:32:13,946 WARN [main] hdfs.DFSClient
> (DFSInputStream.java:blockSeekTo(612)) - Failed to connect to
> /10.40.8.10:50010 for block, add to deadNodes and continue.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while
> waiting for channel to be ready for connect. ch :
> java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while
> waiting for channel to be ready for connect. ch :
> java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
> at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
> at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
> at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
> at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
> at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
> at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> I have also seen similar delays and stack trace printouts when executing HDFS
> CLI commands on those same files (hdfs dfs -cat, hdfs dfs -tail, etc.).
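The size dependence the reporter describes is consistent with one connect
timeout being paid per block whose first-chosen replica is the unreachable
datanode. A back-of-envelope sketch, assuming the Hadoop 2.x default 128 MB
dfs.blocksize and the 60-second timeout seen in the log (worst case only; in
practice the client adds the node to deadNodes after the first failures, which
is why the observed 121 s is roughly two timeouts rather than the full bound):
{code}
import math

BLOCK_SIZE = 128 * 1024 * 1024  # default dfs.blocksize in Hadoop 2.x (assumed for this cluster)
CONNECT_TIMEOUT_S = 60          # from the log: "60000 millis timeout"

def worst_case_read_delay(file_size: int) -> int:
    """Rough worst case: every block's first replica choice is the dead
    datanode, costing one full connect timeout per block."""
    blocks = math.ceil(file_size / BLOCK_SIZE)
    return blocks * CONNECT_TIMEOUT_S

print(worst_case_read_delay(852483361))  # 852 MB file -> 7 blocks -> 420 s
print(worst_case_read_delay(15458))      # small file  -> 1 block  -> 60 s
{code}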
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)