Dear all,

We are using Hadoop 0.20.2 with a couple of patches, and HBase 0.20.6. When we run a MapReduce job that does a lot of random access to an HBase table, we see many log entries like the following, at the same time, on the region server and the data node:

For RegionServer:

"INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_7212216405058183301_3974453 from any node: java.io.IOException: No live nodes contain current block"

"WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.0.2.44:50010 for file /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block 7212216405058183301:java.io.IOException: Got error in response to OP_READ_BLOCK for file /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block 7212216405058183301"

For DataNode:

"ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.2.26:50010, storageID=DS-1332752738-192.168.11.99-50010-1285486780176, infoPort=50075, ipcPort=50020):DataXceiver
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)"

We changed the dfs.datanode.max.xcievers parameter to 4096 in hdfs-site.xml, in both the hadoop and hdfs configuration. But when we connected VisualVM to a data node, we found fewer than 100 DataXceiver threads (close to 100; we counted 97 in a thread dump, and I am guessing the missing 3 had either just finished or just started). The thread dump entries look like the following:

"org.apache.hadoop.hdfs.server.datanode.DataXceiver@3546286a" - Thread t@2941003
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

I guess our setting in hdfs-site.xml did not really take effect. We restarted the Hadoop cluster with stop-dfs.sh and start-dfs.sh, and restarted HBase as well.

Could anyone tell me how to make sure the xceiver parameter is actually in effect, or whether there is anything else I should do besides restarting DFS and HBase? Is there a check I could do in the web interface or anywhere else?

Thanks in advance.

Best wishes,
Stanley Xu
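
P.S. For anyone who wants to repeat the two checks described above, here is a minimal Python sketch of them. It is only an illustration: the function names are made up, and it assumes the usual Hadoop <configuration>/<property>/<name>/<value> file layout and thread dump header lines of the form quoted above. It only reports what the file on disk says and how many DataXceiver threads appear in a saved dump; it cannot confirm which value the running DataNode actually loaded.

```python
import xml.etree.ElementTree as ET

# Prefix of a DataXceiver thread header line in the dump quoted above.
# The trailing "@" keeps other thread names that merely start with
# "DataXceiver" (for example a DataXceiverServer thread) out of the count.
XCEIVER_PREFIX = '"org.apache.hadoop.hdfs.server.datanode.DataXceiver@'


def property_values(xml_text, name):
    """Return every <value> configured for `name`, in file order.

    All matches are returned so that duplicate definitions in the same
    file are visible instead of being silently hidden.
    """
    root = ET.fromstring(xml_text)
    values = []
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            values.append((prop.findtext("value") or "").strip())
    return values


def count_xceiver_threads(thread_dump):
    """Count thread header lines that belong to DataXceiver threads."""
    return sum(
        1
        for line in thread_dump.splitlines()
        if line.strip().startswith(XCEIVER_PREFIX)
    )
```

To use it, read the hdfs-site.xml from the data node's own configuration directory into a string and call property_values(text, "dfs.datanode.max.xcievers"), then save a thread dump to a file and pass its contents to count_xceiver_threads. An empty list from the first call would mean the property is not set in that particular file.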
