[ 
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586049#comment-13586049
 ] 

Eric Newton commented on HBASE-7878:
------------------------------------

Ted asked me to comment a bit further on the test that found this problem.

We (the Accumulo team) use our Continuous Ingest test, along with a script that 
randomly kills services, to verify that Accumulo doesn't lose data during 
recovery. We had previously ported this test to HBase using Gora.  We found 
an unrelated data loss problem a while back and reported it under HBASE-5754.  
Details about the test can be found in the GitHub project 
[Goraci|https://github.com/keith-turner/goraci].


In this case, we found the data loss in Accumulo and tracked it down to the 
lease recovery.  Since we used the same approach as HBase, I created this 
ticket.
                
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.6
>
>         Attachments: 7878-trunk-v2.txt, 7878-trunk-v3.txt
>
>
> I think this is a problem, so I'm opening a ticket so that an HBase person 
> can take a look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
> recovery for Accumulo after HBase's lease recovery.  During testing, we 
> experienced data loss.  I found it is necessary to wait until recoverLease 
> returns true to know that the file has truly been closed.  In FSHDFSUtils, 
> the return value of recoverLease is not checked. In the unit tests created 
> to check lease recovery in HBASE-2645, the return value of recoverLease is 
> always checked.
> I think FSHDFSUtils should be modified to check the return value, and wait 
> until it returns true.
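
For illustration only, the "check the return value and wait until it returns true" idea described above can be sketched as a small retry loop. This is not the attached patch and not the FSHDFSUtils code. The class name, the LeaseRecoverer interface, and the maxAttempts and pauseMillis parameters are all hypothetical; in real use the lambda would presumably wrap a call such as DistributedFileSystem.recoverLease(path), which returns a boolean.

```java
import java.io.IOException;

public class LeaseRecoveryWait {

    /**
     * Hypothetical stand-in for the real call, e.g.
     * () -> dfs.recoverLease(path) on a DistributedFileSystem.
     * Returns true once the file is known to be closed.
     */
    @FunctionalInterface
    public interface LeaseRecoverer {
        boolean recoverLease() throws IOException;
    }

    /**
     * Calls recoverLease repeatedly until it returns true.
     * Returns the number of attempts that were needed.
     * Gives up with an IOException after maxAttempts (an assumed policy;
     * the ticket itself does not specify a limit or a pause length).
     */
    public static int waitForLeaseRecovery(LeaseRecoverer recoverer,
                                           int maxAttempts,
                                           long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // The key point from the ticket: do not ignore this result.
            if (recoverer.recoverLease()) {
                return attempt;
            }
            // Pause before asking again, except after the final attempt.
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis);
            }
        }
        throw new IOException(
            "Lease not recovered after " + maxAttempts + " attempts");
    }
}
```

A caller would only start reading the write-ahead log after waitForLeaseRecovery returns. Whether to bound the number of attempts, and how long to pause between them, is a design choice left open here.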

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
