[ 
https://issues.apache.org/jira/browse/HBASE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088121#comment-16088121
 ] 

stack commented on HBASE-18366:
-------------------------------

bq. Initially all RS are at the same version i.e. 3.0.0-SNAPSHOT. 
HMaster.getRegionServerVersion() returns version 0.0.0 for dead RS (carrying 
meta)....MoveRegionProcedure to move meta region from RS with version 0.0.0 to 
one of other RS with latest version.

This is good. We have double the procedures working on reassign now.

bq. I found that server can be online and dead at the same time!

Good one, [~uagashe]. This is a 'hole'.

On the change, it looks good to me. I wonder, though, how something went into 
the dead server list without being pulled from the online list. That seems like 
a backdoor we want to close.

I can disable for now but will not resolve this issue. I like pulling the 
checkIfShouldMoveSystemRegionAsync handling logic into your new procedure, 
HBASE-18261. It would be good to figure out why adding a server to the dead 
list does not purge it from the online list. Is it because the crash procedure 
has not processed it yet? If so, how did it get into the dead list?
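
To make the invariant in question concrete, here is a minimal sketch, not the 
actual master code. The class ServerLists and its methods are hypothetical 
names made up for illustration; the only point is that if the dead list has a 
single entry point that also removes the server from the online list under the 
same lock, a server cannot be seen as online and dead at the same time.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch only; these are NOT HBase's real classes or methods.
final class ServerLists {
  private final Set<String> online = new HashSet<>();
  private final Set<String> dead = new HashSet<>();

  // A server that (re)registers is online and no longer counted as dead.
  synchronized void markOnline(String serverName) {
    dead.remove(serverName);
    online.add(serverName);
  }

  // Single entry point to the dead list. Removal from the online list
  // happens under the same lock, so no caller observes both states.
  synchronized void markDead(String serverName) {
    online.remove(serverName);
    dead.add(serverName);
  }

  synchronized boolean isOnline(String serverName) {
    return online.contains(serverName);
  }

  synchronized boolean isDead(String serverName) {
    return dead.contains(serverName);
  }

  public static void main(String[] args) {
    ServerLists lists = new ServerLists();
    String rs = "rs1.example.org,16020,1"; // made-up server name
    lists.markOnline(rs);
    lists.markDead(rs);
    // Prints "online=false dead=true"
    System.out.println("online=" + lists.isOnline(rs) + " dead=" + lists.isDead(rs));
  }
}
```

Any code path that adds to the dead list without going through something like 
markDead would be the kind of backdoor described above.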

Thanks [~uagashe]

> Fix flaky test 
> hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18366
>                 URL: https://issues.apache.org/jira/browse/HBASE-18366
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Umesh Agashe
>            Assignee: Umesh Agashe
>
> It worked for a few days after enabling it with HBASE-18278. But started 
> failing after commits:
> 6786b2b
> 68436c9
> 75d2eca
> 50bb045
> df93c13
> It works with one commit before: c5abb6c. Need to see what changed with those 
> commits.
> Currently it fails with TableNotFoundException.



