[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843572#comment-13843572
 ] 

Ted Yu commented on HBASE-10000:
--------------------------------

Here is how patch v6 addresses the above scenario:
{code}
      if (leaseRecoveryReqTS == HConstants.LEASE_RECOVERY_UNREQUESTED || nbAttempt > 0) {
        startWaiting = EnvironmentEdgeManager.currentTimeMillis();
        if (recoverLease(dfs, nbAttempt, p, startWaiting)) return true;
      }
...
        if (nbAttempt == 0 && leaseRecoveryReqTS != HConstants.LEASE_RECOVERY_UNREQUESTED) {
          firstPause -= (EnvironmentEdgeManager.currentTimeMillis() - leaseRecoveryReqTS);
        }
        if (nbAttempt == 0 && isFileClosedMeth == null) {
          if (firstPause > 0) Thread.sleep(firstPause);
          else continue;
        } else {
{code}
If the master initiated lease recovery more than 4 seconds ago AND isFileClosed 
is not available on the region server, firstPause would be negative. In that 
case the code skips the sleep, continues with iteration #2 and starts lease 
recovery, which keeps the previous behavior.
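
To make the arithmetic concrete, here is a minimal standalone sketch of the first-pause adjustment, not the patch itself. The class name, the method name, and the -1 sentinel are hypothetical stand-ins (the actual value of HConstants.LEASE_RECOVERY_UNREQUESTED is not shown above), and the 4000 ms first pause is assumed from the "4 seconds" mentioned in the scenario.

```java
final class FirstPauseSketch {
    // Hypothetical sentinel standing in for HConstants.LEASE_RECOVERY_UNREQUESTED;
    // the real constant's value is an assumption here.
    static final long LEASE_RECOVERY_UNREQUESTED = -1L;

    /**
     * Returns how much of the first pause is still left on attempt 0.
     * If the master never requested lease recovery, the full pause applies.
     * Otherwise the time already elapsed since the request is subtracted,
     * so the result may be zero or negative.
     */
    static long remainingFirstPause(long firstPauseMs, long leaseRecoveryReqTs, long nowMs) {
        if (leaseRecoveryReqTs == LEASE_RECOVERY_UNREQUESTED) {
            return firstPauseMs;
        }
        return firstPauseMs - (nowMs - leaseRecoveryReqTs);
    }

    /** Sleep only when some of the pause is left; otherwise go to the next iteration. */
    static boolean shouldSleep(long remainingMs) {
        return remainingMs > 0;
    }

    public static void main(String[] args) {
        long firstPauseMs = 4000L; // assumed first pause of 4 seconds
        long now = 100_000L;       // fixed "current time" so the demo is deterministic

        // No request from the master: full pause.
        System.out.println(remainingFirstPause(firstPauseMs, LEASE_RECOVERY_UNREQUESTED, now)); // 4000
        // Master requested recovery 1.5 s ago: only 2.5 s left to wait.
        System.out.println(remainingFirstPause(firstPauseMs, now - 1500L, now)); // 2500
        // Master requested recovery 6 s ago: negative, so no sleep at all.
        long overdue = remainingFirstPause(firstPauseMs, now - 6000L, now);
        System.out.println(overdue + " sleep=" + shouldSleep(overdue)); // -2000 sleep=false
    }
}
```

The third case is the scenario in question: the remaining pause is negative, so the worker does not sleep and moves straight on to the next attempt.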

I am trying to come up with a test for this scenario. I plan to lift 
startWaiting into an instance variable so that the test can query it and verify 
that we don't wait 1 extra minute.
Does this sound good?

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-10000
>                 URL: https://issues.apache.org/jira/browse/HBASE-10000
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.98.1
>
>         Attachments: 10000-0.96-v5.txt, 10000-0.96-v6.txt, 
> 10000-recover-ts-with-pb-2.txt, 10000-recover-ts-with-pb-3.txt, 
> 10000-recover-ts-with-pb-4.txt, 10000-recover-ts-with-pb-5.txt, 
> 10000-recover-ts-with-pb-6.txt, 10000-v4.txt, 10000-v5.txt, 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this 
> idea.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
