[ 
https://issues.apache.org/jira/browse/HBASE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673408#comment-13673408
 ] 

Jeffrey Zhong commented on HBASE-8666:
--------------------------------------

{quote}
In what scenarios would previouslyFailedServers not suffice alone? Will 
previouslyFailedMetaRSs not be a subset of previouslyFailedServers.
{quote}
When I ran tests that killed the RS and Master in random order, I ended up 
with previouslyFailedMetaRSs not being a subset of previouslyFailedServers. 
The result is bad because META can't get out of the recovering state. Hence 
the v3 patch, which makes sure .META. gets out of the recovering state even 
if data integrity was broken before the master starts up.
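
To illustrate the subset question, here is a minimal sketch (not the patch 
itself) of how the two sets can diverge and why META recovery should not rely 
on previouslyFailedServers alone. The class and method names are hypothetical, 
and servers are modelled as plain strings rather than HBase's ServerName.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of HBase.
class FailedServerSets {

    /** Servers that previously hosted META but are absent from the general failed set. */
    static Set<String> metaOnlyFailures(Set<String> previouslyFailedServers,
                                        Set<String> previouslyFailedMetaRSs) {
        Set<String> missing = new HashSet<>(previouslyFailedMetaRSs);
        missing.removeAll(previouslyFailedServers);
        return missing;
    }

    /** Union of both sets: every server whose logs may still hold META edits. */
    static Set<String> serversToRecoverForMeta(Set<String> previouslyFailedServers,
                                               Set<String> previouslyFailedMetaRSs) {
        Set<String> all = new HashSet<>(previouslyFailedServers);
        all.addAll(previouslyFailedMetaRSs);
        return all;
    }
}
```

If metaOnlyFailures returns a non-empty set, the subset assumption does not 
hold, and handling only previouslyFailedServers would leave some META 
recovery work undone.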

{quote}
Something similar in removeRecoveringRegionsFromZK() too?
{quote}
The reason to initialize the removeRecoveringRegionsFromZK to 0 is to let the 
recovering region GC run once after the master is initialized, to remove 
possibly stale recovering regions. The call will be triggered inside 
TimeoutMonitor#removeRecoveringRegionsFromZK(null, null). This change is a 
nice-to-have.
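
As a rough sketch of the "initialize to 0" idea, assuming the GC is gated by 
a last-run timestamp compared against an interval (an assumption made for 
illustration; the actual field and check in the patch may differ), starting 
the timestamp at 0 makes the first periodic check fire. All names below are 
hypothetical.

```java
// Hypothetical periodic task gate, for illustration only; not HBase code.
class StaleRecoveringRegionGc {
    // Starting at 0 means the first check after startup is always "overdue",
    // provided nowMillis is a wall-clock value such as System.currentTimeMillis().
    private long lastRunMillis = 0L;
    private final long intervalMillis;
    private int runs = 0;

    StaleRecoveringRegionGc(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Runs the cleanup if at least intervalMillis has passed since the last run. */
    boolean maybeRun(long nowMillis) {
        if (nowMillis - lastRunMillis < intervalMillis) {
            return false; // too soon since the last cleanup
        }
        lastRunMillis = nowMillis;
        runs++; // placeholder for the real cleanup of stale recovering regions
        return true;
    }

    int runs() {
        return runs;
    }
}
```

With the timestamp initialized to the current time instead, the first cleanup 
would be delayed by a full interval after the master comes up.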

Thanks.

> META region isn't fully recovered during master initialization when META 
> region recovery had chained failures
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8666
>                 URL: https://issues.apache.org/jira/browse/HBASE-8666
>             Project: HBase
>          Issue Type: Bug
>          Components: MTTR
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: hbase-8666.patch, hbase-8666-v2.patch, 
> hbase-8666-v3.patch
>
>
> In distributedLogReplay mode, when META recovery has experienced chained 
> failures (recovery failed multiple times in a row), the META region can't 
> be fully recovered during master startup.
>  
