If you have block caching turned off, this would be expected... What does your hbase-site.xml look like?
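For context, in HBase 0.20 the block cache is sized via the hfile.block.cache.size property in hbase-site.xml (the fraction of regionserver heap given to the cache; 0.2 was the era's default, and 0 disables it). A minimal sketch of the relevant fragment — property name per the 0.20 defaults, values illustrative:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Fraction of the regionserver heap used for the HFile block cache.
       Setting this to 0 disables block caching entirely, which forces
       every read (including repeated reads of the same small file) to
       go back to the datanode -- consistent with the log pattern below. -->
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.2</value>
  </property>
</configuration>
```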
On Wed, Apr 7, 2010 at 7:49 PM, Raghu Angadi <rang...@apache.org> wrote:
> We are working with a small HBase cluster (5 nodes) with fairly beefy nodes.
> While looking at why all the regionservers died at one time, we noticed that
> these servers read some files 100s of times a second. This may not be the
> cause of the error... but do you think this is odd?
>
> HBase version: 0.20.1. The cluster was handling mainly write traffic.
> Note that in the datanode log, there are a lot of reads of these files.
>
> One of the RS logs:
> ---
> 2010-04-06 21:51:33,923 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: campaign,4522\x234\x23201003,1268865840941
> 2010-04-06 21:51:34,211 INFO org.apache.hadoop.hbase.regionserver.HRegion: region campaign,4522\x234\x23201003,1268865840941/407724784 available; sequence id is 1607026498
> 2010-04-06 21:51:43,327 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_8972126557191254374_1090962 from any node: java.io.IOException: No live nodes contain current block
> 2010-04-06 21:51:43,328 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-5586169098563059270_1078171 from any node: java.io.IOException: No live nodes contain current block
> 2010-04-06 21:51:43,328 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-7610953303919156937_1089667 from any node: java.io.IOException: No live nodes contain current block
> [...]
> ----
>
> Portion of a grep for one of the blocks mentioned above in the datanode log:
> ----
> 39725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 97000
> 2010-04-06 21:51:43,307 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.0.72:50010, dest: /10.10.0.72:43699, bytes: 6976, op: HDFS_READ, cliID: DFSClient_-1439725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 76000
> 2010-04-06 21:51:43,310 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.0.72:50010, dest: /10.10.0.72:45123, bytes: 6976, op: HDFS_READ, cliID: DFSClient_-1439725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 93000
> 2010-04-06 21:51:43,314 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.0.72:50010, dest: /10.10.0.72:41891, bytes: 6976, op: HDFS_READ, cliID: DFSClient_-1439725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 267000
> 2010-04-06 21:51:43,318 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.0.72:50010, dest: /10.10.0.72:46412, bytes: 6976, op: HDFS_READ, cliID: DFSClient_-1439725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 91000
> 2010-04-06 21:51:46,330 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.0.72:50010, dest: /10.10.0.72:40657, bytes: 6976, op: HDFS_READ, cliID: DFSClient_-1439725703, offset: 0, srvID: DS-977430382-10.10.0.72-50010-1266601998020, blockid: blk_8972126557191254374_1090962, duration: 85000
> ------
>
> There are thousands of repeated reads of many small files like this.
>
> ---
> From the NN log, this block was created for /hbase/.META./1028785192/info/1728561479703335912:
> 2010-04-06 21:51:20,906 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.META./1028785192/info/1728561479703335912. blk_8972126557191254374_1090962
> ----
>
> Btw, we had single replication set for this file by mistake.
>
> Thanks for taking a look.
> Raghu.
>
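Since the quoted email mentions replication was accidentally set to 1 (which explains "No live nodes contain current block" the moment that single datanode is slow or down), a sketch of how one might inspect and repair it with the standard Hadoop CLI — paths here mirror the ones in the logs and are illustrative:

```shell
#!/bin/sh
# Check block/replica health for the .META. region files; -locations shows
# which datanodes hold each replica, so under-replicated blocks stand out.
hadoop fsck /hbase/.META. -files -blocks -locations

# Raise the replication factor back to 3 for everything under /hbase;
# -w waits until replication actually completes before returning.
hadoop fs -setrep -w 3 -R /hbase
```

Note this only fixes files that already exist; the dfs.replication setting in hdfs-site.xml (and any per-table override) governs files created afterwards.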