[
https://issues.apache.org/jira/browse/HBASE-22631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905053#comment-16905053
]
Wellington Chevreuil commented on HBASE-22631:
----------------------------------------------
Thanks for the extra details, [~yu-huiyang]. So the issue seems to be that this
region was supposedly deleted already, as we can see the _GCRegionProcedure_
completed with *success* for that region, but then it comes back when a given RS
crashes and an SCP is submitted for it. Could you share the Master log
file, together with the tempt20 RS log file, covering the period between
"2019-06-26 10:40" and "2019-06-26 10:42"? I suspect this might be the same issue
as the one addressed in HBASE-21843, where the RS holding meta crashes and the
meta WAL edits are then skipped during WAL replay, so some recent updates to
meta are lost.
> assign failed may make gced parent region appear again !!!
> ----------------------------------------------------------
>
> Key: HBASE-22631
> URL: https://issues.apache.org/jira/browse/HBASE-22631
> Project: HBase
> Issue Type: Bug
> Components: proc-v2
> Affects Versions: 2.1.1
> Reporter: yuhuiyang
> Priority: Major
> Attachments: HBASE-22631-branch-2.1-01.patch, assign.png,
> assignProcedure.txt, serverCrash.png, splitAndGc.png
>
>
> When I assign a region A, the process is as follows:
> step1 : A is assigned to rs1, and rs1 fails to open it.
> step2 : assignProcedure handleFailure.
> step3 : A is assigned to rs2 and rs2 succeeds in opening it.
> The above is the normal flow. However, when rs1 is restarted after region A was
> split and its GCRegionProcedure succeeded, region A appears again!
> The reason is that region A is not removed from the serverMap correctly when
> assignProcedure runs handleFailure, because the code regionNode.offline() sets
> the regionNode's regionLocation to null and the regionNode's state to
> OFFLINE. So when the code
> env.getAssignmentManager().undoRegionAsOpening(regionNode) runs, it does
> nothing. So when the rs1 restart event triggers a serverCrashProcedure, it gets
> the regions from serverMap, including region A; A is then assigned again
> and its hdfs dir is created.
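
To make the ordering problem described above easier to follow, here is a
minimal toy model in Java. It is only a sketch under stated assumptions:
ServerMapSketch, its RegionNode class, markRegionAsOpening,
regionsOnCrashedServer and the handleFailure helper are hypothetical stand-ins
that borrow names from the report, not HBase's actual classes or method
bodies. The sketch assumes that undoRegionAsOpening only acts on a node that
is still OPENING with a known location, as the description suggests.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: names mirror the report, the bodies are NOT HBase source code.
class ServerMapSketch {
    static class RegionNode {
        final String name;
        String regionLocation;   // server the region is being opened on, or null
        String state = "OFFLINE";

        RegionNode(String name) { this.name = name; }

        // Mirrors the behaviour described in the report: clears location and state.
        void offline() { regionLocation = null; state = "OFFLINE"; }
    }

    // server name -> regions the master believes belong to that server
    final Map<String, Set<String>> serverMap = new HashMap<>();

    void markRegionAsOpening(RegionNode node, String server) {
        node.regionLocation = server;
        node.state = "OPENING";
        serverMap.computeIfAbsent(server, s -> new HashSet<>()).add(node.name);
    }

    // Assumption: the undo step needs the OPENING state and the location to find the entry.
    void undoRegionAsOpening(RegionNode node) {
        if (!"OPENING".equals(node.state) || node.regionLocation == null) {
            return; // nothing left to look up, so the serverMap entry stays behind
        }
        Set<String> regions = serverMap.get(node.regionLocation);
        if (regions != null) {
            regions.remove(node.name);
        }
    }

    // What a crash handler for `server` would see in this model.
    Set<String> regionsOnCrashedServer(String server) {
        return serverMap.getOrDefault(server, new HashSet<>());
    }

    // Runs a failed open of region A on rs1 with the two possible orderings.
    static Set<String> handleFailure(boolean undoFirst) {
        ServerMapSketch am = new ServerMapSketch();
        RegionNode a = new RegionNode("A");
        am.markRegionAsOpening(a, "rs1");
        if (undoFirst) {
            am.undoRegionAsOpening(a);
            a.offline();
        } else {
            a.offline();               // ordering described in the report
            am.undoRegionAsOpening(a); // becomes a no-op
        }
        return am.regionsOnCrashedServer("rs1");
    }

    public static void main(String[] args) {
        System.out.println("offline() first: " + handleFailure(false)); // [A]
        System.out.println("undo first:      " + handleFailure(true));  // []
    }
}
```

In this model, calling offline() first leaves A registered under rs1, which is
what a later crash handler for rs1 would then pick up. Running the undo step
first is just one ordering that avoids the stale entry in the toy model; it is
not a statement about what the attached patch does.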
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)