Did anyone get a chance to look at this and respond? I am pretty much stuck with these errors.
-Avani

-----Original Message-----
From: Sharma, Avani [mailto:[email protected]]
Sent: Thursday, September 23, 2010 4:51 PM
To: [email protected]
Subject: DataXceiver problem slowing down random reads

I am getting the errors below in my datanode and regionserver logs when doing random reads from HBase tables using Stargate. My HBase is 0.20.6 and Hadoop is 0.20.2. We have a 3-node cluster: one machine runs the master, namenode, jobtracker, tasktracker, datanode and regionserver, and the other two machines each run a tasktracker, datanode and regionserver. The heap size is 4GB for all 3 regionservers (only).

In hdfs-site.xml:
dfs.datanode.max.xcievers = 2048
dfs.datanode.socket.write.timeout = 0 (to avoid socket timeout errors. Is this needed with this version of HBase?)

ulimit -n = 2048

I have gone through a number of emails on the mailing list, but have not been able to resolve this issue at my end. Any help is appreciated.

2010-09-24 06:19:35,206 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_517680959608157971_70974 file /ebay/hadoop/data/current/subdir48/blk_517680959608157971
2010-09-24 06:19:35,239 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.110.210.11:50010, storageID=DS-1397488018-10.110.210.11-50010-1281421533434, infoPort=50075, ipcPort=50020):Got exception while serving blk_-6923756801300423893_70891 to /10.110.210.13:
java.io.IOException: Block blk_-6923756801300423893_70891 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
    at java.lang.Thread.run(Thread.java:619)
2010-09-24 06:19:35,239 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.110.210.11:50010, storageID=DS-1397488018-10.110.210.11-50010-1281421533434, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_-6923756801300423893_70891 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
    at java.lang.Thread.run(Thread.java:619)

I restarted the regionserver. Before the restart, below is the error from shutting down:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/VRS/compaction.dir/945316184/5069182003336368746 File does not exist. [Lease. Holder: DFSClient_-511126949, pendingcreates: 1]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

After the region server restarted, I get the following:

2010-09-24 04:31:39,149 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://tnsardev01.vip.ebay.com/hbase/VRS/47647742/data/1134106899916871771.2021370808; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify!
java.io.IOException: Cannot open filename /hbase/VRS/2021370808/data/1134106899916871771
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.<init>(HFile.java:731)
    at org.apache.hadoop.hbase.io.HalfHFileReader.<init>(HalfHFileReader.java:66)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:266)

and also

2010-09-24 05:42:15,333 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.NotServingRegionException:

and a number of

2010-09-24 06:35:32,452 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@48cf41af
2010-09-24 06:35:45,481 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@cfc7ecf
2010-09-24 06:35:57,357 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@68fe0234
2010-09-24 06:36:10,807 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@2829fd48
2010-09-24 06:36:25,665 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@64755a16
2010-09-24 06:36:35,236 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@7463874c
2010-09-24 06:36:41,557 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.storescan...@10f706e7
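
For reference, the two HDFS settings quoted earlier in this message would typically appear in hdfs-site.xml roughly as sketched below. Only the property names and values come from the message; the enclosing <configuration> element is assumed from the standard Hadoop configuration layout, and the actual file on this cluster may contain other properties. The ulimit -n value is an operating-system setting and is not part of this file.

```xml
<!-- Sketch only: names and values are those quoted in the message above;
     the surrounding layout is the usual Hadoop site-file structure (assumed). -->
<configuration>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>
  <property>
    <!-- 0 as quoted in the message, set there to avoid socket timeout errors -->
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>
</configuration>
```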
