On Tuesday, July 3, 2012 at 10:08 AM, Jay Wilson wrote:

> 2012-07-03 09:05:00,530 ERROR
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Couldn't close
> log at
> hdfs://devrackA-00:8020/var/hbase-hadoop/hbase/-ROOT-/70236052/recovered.edits/0000000000000000046.temp
> java.net.NoRouteToHostException: No route to host
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:429)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3567)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3522)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2300(DFSClient.java:2720)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2915)
> 2012-07-03 09:05:00,536 WARN
> org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting of
> [devrackA-03,60020,1341328322971, devrackA-04,60020,1341328322988,
> devrackA-05,60020,1341328322976]
> java.net.NoRouteToHostException: No route to host
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:429)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3567)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3522)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2300(DFSClient.java:2720)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2915)

NoRouteToHostException hints at some network trouble… Region servers aren't 
able to talk to the underlying Datanode processes. That would cause grief for 
sure.

Looks like you have been having network trouble from the ZK issues that you 
mentioned earlier too. Maybe your TOR is dropping packets or something like 
that? A ping won't tell you that though. Do you have any sort of monitoring in 
place that can give you insight into how the network is performing?

Reply via email to