[ https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allan Yang updated HBASE-20878:
-------------------------------
Description:
In MergeTableRegionsProcedure, we close the regions to be merged using
UnassignProcedure. However, if the RS hosting these regions crashes, a
ServerCrashProcedure executes at the same time. The UnassignProcedures are
blocked until all logs are split. But since these regions are closed for
merging, they won't be opened again, so the recovered.edits in the region dir
are never replayed and data is lost.
I provided a test to reproduce this case. I strongly suspect that the split
region procedure also has this kind of problem. I will check later.
> Data loss if merging regions while ServerCrashProcedure executing
> -----------------------------------------------------------------
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
> Issue Type: Bug
> Components: amv2
> Affects Versions: 3.0.0, 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to be merged using
> UnassignProcedure. However, if the RS hosting these regions crashes, a
> ServerCrashProcedure executes at the same time. The UnassignProcedures are
> blocked until all logs are split. But since these regions are closed for
> merging, they won't be opened again, so the recovered.edits in the region dir
> are never replayed and data is lost.
> I provided a test to reproduce this case. I strongly suspect that the split
> region procedure also has this kind of problem. I will check later.
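
To make the sequence described above easier to follow, here is a minimal toy model of it in plain Java. It is only a sketch under stated assumptions: ToyRegion, splitLogs, openAndReplay and rowsAfterMerge are hypothetical names invented for illustration, not HBase classes or methods, and the replayBeforeMerge flag only shows what a replay would preserve. It is not a description of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: nothing here is an HBase API. All names are hypothetical.
public class MergeCrashRaceSketch {

    /** A region whose data lives partly in store files and partly only in the WAL. */
    static final class ToyRegion {
        final List<String> flushedRows = new ArrayList<>();    // already in store files
        final List<String> walOnlyRows = new ArrayList<>();    // unflushed, only in the crashed server's WAL
        final List<String> recoveredEdits = new ArrayList<>(); // produced by log splitting

        /** Crash handling: log splitting moves WAL-only edits into recovered edits. */
        void splitLogs() {
            recoveredEdits.addAll(walOnlyRows);
            walOnlyRows.clear();
        }

        /** Opening a region replays its recovered edits into the region's data. */
        void openAndReplay() {
            flushedRows.addAll(recoveredEdits);
            recoveredEdits.clear();
        }
    }

    /** Returns how many rows the merged region ends up with. */
    static int rowsAfterMerge(boolean replayBeforeMerge) {
        ToyRegion a = new ToyRegion();
        a.flushedRows.add("row-a1");
        a.walOnlyRows.add("row-a2"); // written shortly before the crash, never flushed

        ToyRegion b = new ToyRegion();
        b.flushedRows.add("row-b1");

        // The hosting server crashes; the unassign step waits until logs are split.
        a.splitLogs();
        b.splitLogs();

        // In the reported scenario the regions stay closed, so nothing is replayed.
        if (replayBeforeMerge) {
            a.openAndReplay();
            b.openAndReplay();
        }

        // Assumption of this model: the merge only picks up store-file data.
        List<String> merged = new ArrayList<>(a.flushedRows);
        merged.addAll(b.flushedRows);
        return merged.size();
    }

    public static void main(String[] args) {
        System.out.println("merge without replay: " + rowsAfterMerge(false) + " rows"); // 2 rows
        System.out.println("merge after replay:   " + rowsAfterMerge(true) + " rows");  // 3 rows
    }
}
```

In this model, row-a2 exists only as a recovered edit when the merge runs, so the merged region ends up with two rows instead of three unless the edits are replayed first.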