If you had really hit the xceiver limit, the error message in the datanode log would be pretty obvious about the config. The error message you pasted from the DN doesn't look complete either.
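One way to check for that message is to scan the datanode log for it. Below is a minimal sketch, assuming the limit message reads "xceiverCount N exceeds the limit of concurrent xcievers M", which is the wording as I remember it from 0.20-era datanodes. The pattern and the function name find_xceiver_limit_hits are illustrative assumptions; adjust the pattern to whatever your log actually prints.

```python
import re

# Assumed wording of the datanode's limit message (not taken from this thread);
# change the pattern if your Hadoop build words it differently.
XCEIVER_LIMIT = re.compile(
    r"xceiverCount (\d+) exceeds the limit of concurrent xcievers (\d+)"
)


def find_xceiver_limit_hits(log_lines):
    """Return (current_count, configured_limit) for every matching log line."""
    hits = []
    for line in log_lines:
        match = XCEIVER_LIMIT.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits
```

Pass it an open log file (any iterable of lines works). If it returns an empty list, the datanode most likely never reached the limit; if it returns hits, the second number in each pair is the limit the running datanode reported.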
J-D

On Wed, May 11, 2011 at 7:30 AM, Stanley Xu <[email protected]> wrote:
> Dear all,
>
> We are using hadoop 0.20.2 with a couple of patches, and hbase 0.20.6. When
> we run a MapReduce job that does a lot of random access to an hbase table,
> we see a lot of log entries like the following at the same time in the
> region server and data node:
>
> For RegionServer:
> "INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block
> blk_7212216405058183301_3974453 from any node: java.io.IOException: No live
> nodes contain current block"
> "WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /
> 10.0.2.44:50010 for file
> /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block
> 7212216405058183301:java.io.IOException: Got error in response to
> OP_READ_BLOCK for file
> /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block
> 7212216405058183301" and
>
> For DataNode:
> "ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.0.2.26:50010,
> storageID=DS-1332752738-192.168.11.99-50010-1285486780176, infoPort=50075,
> ipcPort=50020):DataXceiver
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)"
>
> We changed the dfs.datanode.max.xcievers parameter to 4096 in the
> hdfs-site.xml in both hadoop and hdfs configuration. But when we use
> VisualVM to connect to a data node, we found there are fewer than 100
> DataXceiver threads (close to 100; we count 97 in a thread dump, and I am
> guessing the missing 3 had finished or just started). (Thread dump entries
> look like the following.)
>
> "org.apache.hadoop.hdfs.server.datanode.DataXceiver@3546286a" - Thread
> t@2941003
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)"
>
> I guess our setting in the hdfs-site.xml didn't really work or wasn't
> activated.
> We have restarted the hadoop cluster with stop-dfs.sh and start-dfs.sh and
> also restarted hbase.
> I am wondering if anyone could tell me how I can make sure the xceiver
> parameter works, or whether there is anything I should do besides
> restarting dfs and hbase?
> Could I check this in the web interface or anywhere else?
>
> Thanks in advance.
>
> Best wishes,
> Stanley Xu
>
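For the two checks described in the quoted message (what hdfs-site.xml says, and how many DataXceiver threads show up in a dump), here is a rough sketch. It assumes the usual Hadoop <configuration>/<property>/<name>/<value> layout and thread-dump headers shaped like the one quoted above; the function names read_property and count_dataxceiver_threads are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def count_dataxceiver_threads(thread_dump):
    """Count thread header lines whose quoted thread name contains 'DataXceiver@'."""
    return sum(
        1
        for line in thread_dump.splitlines()
        # Header lines start with a quote; stack frames ("at ...") do not.
        if line.lstrip().startswith('"') and "DataXceiver@" in line
    )
```

Feed read_property the text of the hdfs-site.xml on the datanode host and ask for "dfs.datanode.max.xcievers". Note that this only shows what is in the file, not what the running datanode actually loaded, so it is a sanity check on the file and its location rather than proof that the setting is active.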
