[ https://issues.apache.org/jira/browse/HBASE-24117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076839#comment-17076839 ]
Michael Stack commented on HBASE-24117:
---------------------------------------
[~zhangduo] Dang. Yeah. Mistake. Let me repro and post a proper log. Sorry
about that.
> If move target RS crashes, move fails if concurrent master crash
> ----------------------------------------------------------------
>
> Key: HBASE-24117
> URL: https://issues.apache.org/jira/browse/HBASE-24117
> Project: HBase
> Issue Type: Bug
> Components: proc-v2
> Reporter: Michael Stack
> Assignee: Michael Stack
> Priority: Major
> Fix For: 3.0.0, 2.3.0
>
> Attachments: org.apache.hadoop.hbase.master.TestMasterShutdown-output.txt
>
>
> I saw this on TestCloseRegionWithRSCrash. The Region
> 788a516d1f86af98e0a16bcc1afe4fa1 was being moved to RS
> example.com,62652,1586032098445 just after that RS was killed. The Move Close
> fails because the RS has no node in the Master. The Move then tries to
> 'confirm' the close, but that fails too because there is no remote RS. We
> then wait in this state until an operator or some other procedure intervenes
> to 'fix' the state. Normally a ServerCrashProcedure would do the job, but in
> this test the Master is restarted after the RS is killed, a condition we do
> not accommodate. Let me attach the test log.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)