I am getting the errors below in my DataNode and RegionServer logs when doing
random reads from HBase tables via Stargate.
I am running HBase 0.20.6 on Hadoop 0.20.2.
We have a 3-node cluster: one machine runs the HBase master, NameNode,
JobTracker, TaskTracker, DataNode, and RegionServer; the other two each run a
TaskTracker, DataNode, and RegionServer.
The heap size for the three RegionServers (only) is 4 GB.
In hdfs-site.xml:
  dfs.datanode.max.xcievers = 2048
  dfs.datanode.socket.write.timeout = 0 (set to avoid socket timeout errors;
is this still needed with this version of HBase?)
ulimit -n = 2048
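For reference, here is how I have the two HDFS settings above in hdfs-site.xml (property names as they exist in Hadoop 0.20; note that "xcievers" really is misspelled that way in the code):

```xml
<!-- hdfs-site.xml fragment: the two DataNode settings mentioned above -->
<property>
  <!-- Upper bound on concurrent DataXceiver threads per DataNode -->
  <name>dfs.datanode.max.xcievers</name>
  <value>2048</value>
</property>
<property>
  <!-- 0 disables the DataNode socket write timeout entirely -->
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>
```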
I have gone through a number of emails on the mailing list, but have not been
able to resolve this issue at my end. Any help is appreciated.
2010-09-24 06:19:35,206 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_517680959608157971_70974 file /ebay/hadoop/data/current/subdir48/blk_517680959608157971
2010-09-24 06:19:35,239 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.110.210.11:50010, storageID=DS-1397488018-10.110.210.11-50010-1281421533434, infoPort=50075, ipcPort=50020):Got exception while serving blk_-6923756801300423893_70891 to /10.110.210.13:
java.io.IOException: Block blk_-6923756801300423893_70891 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
    at java.lang.Thread.run(Thread.java:619)
2010-09-24 06:19:35,239 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.110.210.11:50010, storageID=DS-1397488018-10.110.210.11-50010-1281421533434, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_-6923756801300423893_70891 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
    at java.lang.Thread.run(Thread.java:619)
I restarted the RegionServer. Below is the error logged during shutdown,
before the restart:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/VRS/compaction.dir/945316184/5069182003336368746 File does not exist. [Lease.  Holder: DFSClient_-511126949, pendingcreates: 1]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
After the RegionServer restarted, I see the following:
2010-09-24 04:31:39,149 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://tnsardev01.vip.ebay.com/hbase/VRS/47647742/data/1134106899916871771.2021370808; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify!
java.io.IOException: Cannot open filename /hbase/VRS/2021370808/data/1134106899916871771
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.<init>(HFile.java:731)
    at org.apache.hadoop.hbase.io.HalfHFileReader.<init>(HalfHFileReader.java:66)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:266)
and also:
2010-09-24 05:42:15,333 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.NotServingRegionException:
and a number of these:
2010-09-24 06:35:32,452 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@48cf41af
2010-09-24 06:35:45,481 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@cfc7ecf
2010-09-24 06:35:57,357 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@68fe0234
2010-09-24 06:36:10,807 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@2829fd48
2010-09-24 06:36:25,665 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@64755a16
2010-09-24 06:36:35,236 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@7463874c
2010-09-24 06:36:41,557 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@10f706e7