Hi,

As Laxman pointed out, there is a potential problem here. We expect the Namenode recovery to complete within a specified time, so the splitLogs logic sleeps for one second after requesting it. But we then carry on reading the HLog file regardless, and the read fails if recovery has not finished. If the logs are not split properly, there could be data loss.

Regards
Ram

-----Original Message-----
From: Laxman [mailto:lakshman...@huawei.com]
Sent: Tuesday, August 02, 2011 10:47 AM
To: hdfs-dev@hadoop.apache.org; d...@hbase.apache.org
Subject: FW: Handling read failures during recovery

Partial mail was sent accidentally. Sorry for that.
Resending with complete details, analysis and logs.

We are using the 20-append version.

To summarize, we noticed two problems in this flow, one each from HDFS and HBase.

1) From HDFS

Even though the client gets the updated block info from the Namenode on the first read failure, it discards the new info and keeps using the old info to retrieve the data from the datanode. So all the read retries fail. [Method parameter reassignment - not reflected in caller]

HDFS code snippet
org.apache.hadoop.hdfs.DFSClient.DFSInputStream.chooseDataNode

    private DNAddrPair chooseDataNode(LocatedBlock block) throws IOException {
      ...
      ...
      block = getBlockAt(block.getStartOffset(), false);
      ...
      ...
    }

Here the method parameter "block" is assigned the new block info, which is not reflected in the caller "blockSeekTo(long target)".

2) From HBase

Excerpt from my previous mail:

> As the recovery is an asynchronous operation recoverLease call will return
> immediately and may end up with read failure as the recovery is in progress.
>
> This may lead to some regions to be in offline state only
> One approach is to introduce a delay in between recovery and read. But, this
> may not be a fool proof way to address this.

I've noticed the delay is already present in the HBase code. But as I mentioned, this may not be a foolproof mechanism to handle this scenario.

HBase code snippet
In the class HLogSplitter, splitLog() calls recoverFileLease().
In recoverFileLease():

    try {
      Thread.sleep(1000);
    } catch (InterruptedException ex) {
      new InterruptedIOException().initCause(ex);
    }

Once the recover call is made, we sleep for one second and proceed with parseHLog().

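The HDFS problem described in this mail (a refreshed value assigned to a method parameter and never seen by the caller) follows from Java passing object references by value. The sketch below reproduces only that language behavior. `BlockInfo`, `refreshByReassignment` and `refreshByReturn` are hypothetical names invented for illustration; they are not HDFS classes or methods, and returning the refreshed value is just one possible way to make it visible to the caller.

```java
final class ParameterReassignmentDemo {

    /** Hypothetical stand-in for a block descriptor; not the HDFS LocatedBlock class. */
    static final class BlockInfo {
        final String name;

        BlockInfo(String name) {
            this.name = name;
        }
    }

    /** Mirrors the reported pattern: the new value is assigned to the parameter only. */
    static void refreshByReassignment(BlockInfo block) {
        // Rebinds the local parameter; the caller's variable still points at the old object.
        block = new BlockInfo("refreshed");
    }

    /** One possible alternative: hand the new value back so the caller can adopt it. */
    static BlockInfo refreshByReturn(BlockInfo block) {
        return new BlockInfo("refreshed");
    }

    public static void main(String[] args) {
        BlockInfo block = new BlockInfo("stale");

        refreshByReassignment(block);
        System.out.println(block.name); // prints "stale"

        block = refreshByReturn(block);
        System.out.println(block.name); // prints "refreshed"
    }
}
```

In the first call the caller keeps working with the stale object, which matches the symptom reported above: every retry uses the old block info.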
Here is the log

2011-07-21 17:01:19,642 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_1311262402613_3094 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
2011-07-21 17:01:20,650 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_1311262402613_3094 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
2011-07-21 17:01:21,669 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_1311262402613_3094 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
2011-07-21 17:01:22,677 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_1311262402613_3318 file=/hbase/.logs/158-1-101-222,20020,1311260346420/158-1-101-222%3A20020.1311265398432
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2491)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2256)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2441)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1984)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1884)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1930)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:201)

-----Original Message-----
From: Stack [mailto:saint....@gmail.com]
Sent: Monday, August 01, 2011 9:03 PM
To: d...@hbase.apache.org
Cc: d...@hbase.apache.org
Subject: Re: Handling read failures during recovery

Which hdfs version and what is the error u see?

Thanks.
Stack

On Aug 1, 2011, at 4:33, Laxman <lakshman...@huawei.com> wrote:

> Hi Everyone,
>
> In HBase we try to recover the HLog file and then immediately proceed with
> read operation.
>
> As the recovery is an asynchronous operation recoverLease call will return
> immediately and may end up with read failure as the recovery is in progress.
>
> This may lead to some regions to be in offline state only.
>
> One approach is to introduce a delay in between recovery and read. But, this
> may not be a fool proof way to address this.
>
> How do we handle this scenario?
>
> Please do correct me if my understanding went wrong.
>
> --Laxman
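
The original question is how to handle a read that starts while lease recovery is still in progress. One possible direction, sketched below under stated assumptions, is to replace a single fixed delay with a bounded loop that re-checks a completion condition before reading. `LeaseRecoveryWait`, `waitForRecovery` and the `recovered` probe are hypothetical names; the sketch does not call any HDFS or HBase API, and how the probe would actually detect finished recovery is left open.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

final class LeaseRecoveryWait {

    /**
     * Polls a caller-supplied probe until it reports success or the attempts run out.
     * The probe is an assumption: it stands for "recovery has finished, the file is readable".
     */
    static boolean waitForRecovery(BooleanSupplier recovered, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recovered.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ex) {
                    // Restore the interrupt flag and give up instead of swallowing it.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger checks = new AtomicInteger();
        // Simulated probe that reports success on the third check.
        boolean ok = waitForRecovery(() -> checks.incrementAndGet() >= 3, 5, 10L);
        System.out.println(ok + " after " + checks.get() + " checks"); // prints "true after 3 checks"
    }
}
```

A caller would proceed to parse the log only when the method returns true, and treat false as a failed split to be retried or reported, rather than reading anyway.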