Michael Stack created HBASE-24117:
-------------------------------------

             Summary: If move target RS crashes, move fails if concurrent master crash
                 Key: HBASE-24117
                 URL: https://issues.apache.org/jira/browse/HBASE-24117
             Project: HBase
          Issue Type: Bug
          Components: proc-v2
            Reporter: Michael Stack


I saw this in TestCloseRegionWithRSCrash. The region
788a516d1f86af98e0a16bcc1afe4fa1 was being moved to RS
example.com,62652,1586032098445 just after that RS was killed. The move's close
step fails because the RS has no node in the Master. The move then tries to
'confirm' the close, but that fails too because there is no remote RS. The
procedure is then left waiting in this state until an operator or some other
procedure intervenes to 'fix' it. Normally a ServerCrashProcedure would do the
job, but in this test the Master is restarted after the RS is killed, a
condition we do not accommodate.
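
To make the sequence easier to follow, here is a toy model of the three cases
described above. It is only a sketch and not HBase code: ToyMaster, Outcome,
move and restartedWith are hypothetical names, and the model assumes that a
restarted Master only knows about servers that check in again, so it never
queues crash handling for an RS that died before the restart.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of these types are HBase classes.
class MoveStuckSketch {
    enum Outcome { MOVED, RECOVERED_BY_CRASH_HANDLING, STUCK }

    /** Minimal stand-in for the Master's view of the cluster. */
    static class ToyMaster {
        final Set<String> onlineServers = new HashSet<>();
        final Set<String> queuedForCrashHandling = new HashSet<>();

        /** The Master notices that an RS died and queues crash handling for it. */
        void serverDied(String rs) {
            if (onlineServers.remove(rs)) {
                queuedForCrashHandling.add(rs);
            }
        }

        /** Assumption: a restarted Master starts with only the servers that check in. */
        static ToyMaster restartedWith(Set<String> liveServers) {
            ToyMaster master = new ToyMaster();
            master.onlineServers.addAll(liveServers);
            return master;
        }
    }

    /** Models a move that has to talk to the given RS. */
    static Outcome move(ToyMaster master, String rs) {
        if (master.onlineServers.contains(rs)) {
            return Outcome.MOVED; // close can be sent and confirmed
        }
        // The close cannot be sent and the confirm fails: no such RS in the Master.
        if (master.queuedForCrashHandling.contains(rs)) {
            return Outcome.RECOVERED_BY_CRASH_HANDLING; // crash handling fixes the state
        }
        return Outcome.STUCK; // nothing is scheduled that would fix the state
    }
}
```

In this model, killing the RS and then calling ToyMaster.restartedWith with an
empty set before move(...) gives STUCK, which corresponds to the case seen in
the test. Calling serverDied on a Master that is still running gives
RECOVERED_BY_CRASH_HANDLING instead.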

Let me attach the test log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
