[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584046#comment-13584046 ]

Uma Maheswara Rao G commented on HBASE-7878:
--------------------------------------------

Currently the recoverLease API does not guarantee that lease recovery has 
completed for that file. It just initiates the recovery and returns false. 
recoverLease returns true only if the file is already closed; otherwise it 
always returns false.
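
For illustration, here is a minimal sketch of how a caller sees those two 
return values (the helper class and path parameter are made up for the 
example, not taken from HBase code):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Hypothetical helper, not from HBase code.
class LeaseRecoveryExample {
  // Reports what a single recoverLease call tells us about the file.
  static boolean tryRecover(DistributedFileSystem dfs, Path wal) throws IOException {
    boolean closed = dfs.recoverLease(wal);
    // true  -> the file was already closed; nothing more to do.
    // false -> recovery was only initiated; it completes asynchronously later.
    return closed;
  }
}
{code}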

Lease recovery steps:
 1) Client requests lease recovery.
 2) NN checks the inode state. If it is not under construction, it returns 
true since the file is already closed. Otherwise it proceeds with recovery.
 3) If the last block state is UNDER_CONSTRUCTION/UNDER_RECOVERY, recovery is 
initiated. This is nothing but choosing a primary DN from the block's 
locations and adding that block to that node's recoverBlocks queue.
 4) The recoverLease API call returns with a false result.
 5) The NN sends the recover-block details to the primary DN as part of the 
heartbeat response.
 6) The primary DN recovers the block replicas on all DNs and calls 
commitBlockSynchronization on the NN.
 7) The NN updates the block with the recovered genstamp in the blocksMap and 
the file inode is finalized.
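
To make the asynchronous part concrete, this is roughly the loop a client 
such as HBase ends up running (the timeout and sleep interval below are 
assumptions for the sketch, not values from FSHDFSUtils):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Hypothetical helper, not from HBase code.
class LeaseRecoveryLoop {
  // Re-issue recoverLease until the NN reports the file as closed.
  // A false return only means recovery was initiated (step 4); the actual
  // block recovery (steps 5-7) happens asynchronously on the DNs and NN.
  static void recoverLeaseBlocking(DistributedFileSystem dfs, Path wal)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + 60000L; // assumed 60s budget
    while (!dfs.recoverLease(wal)) {
      if (System.currentTimeMillis() > deadline) {
        throw new IOException("Lease recovery timed out for " + wal);
      }
      Thread.sleep(1000); // assumed retry interval
    }
  }
}
{code}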

Steps 5, 6 and 7 happen asynchronously. We have to be careful that multiple 
recovery calls will go out, since we call in a loop until it returns true. 
Unfortunately we don't have such cases checked in branch-1, and I don't 
remember any such exception either. If a new recovery request comes before the 
NN has handed the block to a DN, that block will not be added for recovery 
again anyway.
I am not sure about the data loss; what is the exact scenario? But the current 
problem with lease recovery is that there is no way to ensure recovery is also 
complete at the DNs. If we simply call recoverLease and proceed with the 
operations assuming the file might have been recovered, the file might still 
be in the recovery-in-progress stage. The client may then see the block with 
an older genstamp, and when it tries to read it may get the wrong length 
because the blocks were not recovered yet. This is the issue filed as 
HDFS-2296.

One option I am thinking of: how about HDFS exposing an API like 
fs.isFileClosed(src)? If recovery completed successfully, the file should have 
been closed, so HBase could loop on this API for some period.
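
On the HBase side that could look something like the following sketch. Note 
that isFileClosed(Path) is only the API proposed above, not something 
DistributedFileSystem is guaranteed to expose today, and the polling interval 
is just an assumption:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Hypothetical helper, not from HBase code.
class WaitForClose {
  static void waitUntilClosed(DistributedFileSystem dfs, Path wal)
      throws IOException, InterruptedException {
    dfs.recoverLease(wal);            // kick off recovery (may return false)
    while (!dfs.isFileClosed(wal)) {  // poll until the NN has finalized the inode
      Thread.sleep(1000);             // assumed polling interval
    }
  }
}
{code}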

> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.6
>
>         Attachments: 7878-trunk-v1.txt, 7878-trunk-v2.txt, 7878-trunk-v3.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a 
> look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
> recovery for Accumulo after HBase's lease recovery.  During testing, we 
> experienced data loss.  I found it is necessary to wait until recoverLease 
> returns true to know that the file has been truly closed.  In FSHDFSUtils, 
> the return result of recoverLease is not checked. In the unit tests created 
> to check lease recovery in HBASE-2645, the return result of recoverLease is 
> always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait 
> until it returns true.
