0.20.2-cdh3u2 -- "add to deadNodes and continue" would solve this issue. For some reason it's not getting into this code path.
If it's a matter of adding a quick line of code to make this work, then we would rather recompile with that and upgrade later when we have better backup.

-Jack

On Thu, Feb 13, 2014 at 10:55 PM, Stack <[email protected]> wrote:

> On Thu, Feb 13, 2014 at 9:18 PM, Jack Levin <[email protected]> wrote:
>
> > One other question, we get this:
> >
> > 2014-02-13 02:46:12,768 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
> > connect to /10.101.5.5:50010 for file
> > /hbase/img32/b97657bfcbf922045d96315a4ada0782/att/4890606694307129591 for
> > block -9099107892773428976:java.net.SocketTimeoutException: 60000 millis
> > timeout while waiting for channel to be ready for connect. ch :
> > java.nio.channels.SocketChannel[connection-pending remote=/10.101.5.5:50010]
> >
> > Why can't RS do this instead:
> >
> > hbase-root-regionserver-mtab5.prod.imageshack.com.log.2014-02-10:2014-02-10
> > 22:05:11,763 INFO org.apache.hadoop.hdfs.DFSClient: Failed to connect to
> > /10.103.8.109:50010, add to deadNodes and continue
> >
> > "add to deadNodes and continue" specifically?
>
> The regionserver runs on the HDFS API. The implementations can vary. The
> management of nodes -- their coming and going -- is done inside the HDFS
> client code. The regionserver is insulated from all that goes on therein.
>
> What version of HDFS are you on Jack?
>
> St.Ack
