[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845114#comment-13845114
 ] 

Ted Yu commented on HBASE-10000:
--------------------------------

The test failure (around 2013-12-11 05:11:49) was not related to the patch.
For testTaskResigned():
{code}
    int version = ZKUtil.checkExists(zkw, tasknode);
    // Could be small race here.
    if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2);
{code}
There was no log entry similar to the following (which would correspond to the 
waitForCounter() call above):
{code}
2013-12-10 21:23:54,905 INFO  [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1])
{code}
This means waitForCounter() was skipped: the resubmit had already happened, so 
the version retrieved (2) already corresponded to the resubmitted task. version1 
then retrieved the same value, leading to the assertion failure.
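
To make the ordering concrete, here is a minimal single-threaded sketch of the 
race. It does not use ZooKeeper or the real test harness: the class, the 
observe() and resubmit() helpers, and the counters are hypothetical stand-ins, 
and it assumes the znode version is bumped once when the task is resigned and 
once when it is resubmitted, and that the test later expects version1 > version.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResubmitRaceSketch {

    /**
     * Replays the read / wait / read sequence of the test against fake counters.
     * Returns {version, version1} as the test would observe them.
     * All names here are illustrative, not HBase APIs.
     */
    static int[] observe(boolean resubmitBeforeFirstRead) {
        AtomicInteger nodeVersion = new AtomicInteger(0);   // fake znode, just created
        AtomicInteger resubmitCount = new AtomicInteger(0); // stands in for tot_mgr_resubmit

        nodeVersion.incrementAndGet();                      // task marked resigned -> version 1

        if (resubmitBeforeFirstRead) {
            // The race: the manager resubmits before the test reads the version.
            resubmit(nodeVersion, resubmitCount);           // -> version 2
        }

        int version = nodeVersion.get();                    // first read

        if (resubmitCount.get() == 0) {
            // Stands in for waitForCounter(): the resubmit lands while waiting.
            resubmit(nodeVersion, resubmitCount);
        }

        int version1 = nodeVersion.get();                   // second read
        return new int[] {version, version1};
    }

    private static void resubmit(AtomicInteger nodeVersion, AtomicInteger resubmitCount) {
        nodeVersion.incrementAndGet();
        resubmitCount.incrementAndGet();
    }

    public static void main(String[] args) {
        int[] normal = observe(false);
        int[] raced = observe(true);
        System.out.println("normal: version=" + normal[0] + ", version1=" + normal[1]);
        System.out.println("raced:  version=" + raced[0] + ", version1=" + raced[1]);
    }
}
```

In the raced ordering both reads return 2, so a check that version1 is greater 
than version fails even though the resubmit itself worked.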

I placed breakpoints at the beginning of splitLogDistributed() and 
recoverDFSFileLease(); neither was hit.

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-10000
>                 URL: https://issues.apache.org/jira/browse/HBASE-10000
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.98.1
>
>         Attachments: 10000-0.96-v5.txt, 10000-0.96-v6.txt, 
> 10000-recover-ts-with-pb-2.txt, 10000-recover-ts-with-pb-3.txt, 
> 10000-recover-ts-with-pb-4.txt, 10000-recover-ts-with-pb-5.txt, 
> 10000-recover-ts-with-pb-6.txt, 10000-recover-ts-with-pb-7.txt, 
> 10000-recover-ts-with-pb-7.txt, 10000-recover-ts-with-pb-8.txt, 10000-v4.txt, 
> 10000-v5.txt, 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this 
> idea. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
