[
https://issues.apache.org/jira/browse/HBASE-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17505897#comment-17505897
]
Hudson commented on HBASE-26323:
--------------------------------
Results for branch master
[build #534 on
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/534/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/534/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/534/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/534/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Introduce a SnapshotProcedure
> -----------------------------
>
> Key: HBASE-26323
> URL: https://issues.apache.org/jira/browse/HBASE-26323
> Project: HBase
> Issue Type: New Feature
> Components: proc-v2, snapshots
> Reporter: ruanhui
> Assignee: ruanhui
> Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-3
>
>
> Currently, snapshots in HBase use ZooKeeper as the coordinator. This has some limitations:
> a. A snapshot may fail when a region server crashes.
> b. A snapshot may fail when the master restarts.
> c. Only one snapshot per table can be taken at a time.
> d. Snapshot verification is handled by the master, which may take a long time when
> the table has a large number of regions, for example 10000.
>
> Since we now have the procedure v2 framework, it is possible to solve the above
> problems. So here is a procedure-v2-based snapshot implementation. It has the
> following goals:
> a. A snapshot can continue when a region server crashes.
> b. A snapshot can continue when the master restarts.
> c. More than one snapshot per table can be taken at a time.
> d. Region servers can be used to verify the snapshot, which speeds up the procedure.
>
> Here are some details about the implementation.
> *SnapshotProcedure*
> SnapshotProcedure is used to take a snapshot of a table. It acquires a shared
> table lock on the snapshot table and holds the shared lock during suspend and
> yield.
> *SnapshotRegionProcedure*
> SnapshotRegionProcedure is used to take a snapshot of a specific region of the
> snapshot table. It acquires an exclusive region lock and releases the lock during
> suspend and yield. Before dispatching the remote snapshot operation to a region
> server, it checks whether the target region is in transition (RIT). If the
> target region is in RIT, the procedure sleeps for some time and retries.
> *SnapshotVerifyProcedure*
> SnapshotVerifyProcedure is used to send a snapshot verify request to a region
> server. If the snapshot is corrupted, it notifies the parent snapshot procedure
> to retry. If the remote region server has crashed, it chooses another online
> server and retries.
>
> I would be very grateful for any advice and guidance. Is anyone interested in
> taking a look?
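
The failover behaviour described for SnapshotVerifyProcedure can be illustrated with a minimal, self-contained Java sketch. It is not the code from the patch and does not use any HBase API: the class SnapshotVerifyRetrySketch, the RemoteVerifier interface, ServerCrashedException, the Outcome enum and the server names are all hypothetical, and the real procedure runs inside the procedure v2 framework rather than in a plain loop. The sketch only shows the decision logic: try a server, report corruption so that the parent can retry, and move on to another online server if the target has crashed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SnapshotVerifyRetrySketch {

    /** Hypothetical stand-in for the remote verify request; not an HBase API. */
    interface RemoteVerifier {
        /** Returns true if the snapshot is intact; throws if the server is down. */
        boolean verify(String server) throws ServerCrashedException;
    }

    /** Hypothetical signal that the chosen region server has crashed. */
    static class ServerCrashedException extends Exception {
        ServerCrashedException(String server) {
            super("server crashed: " + server);
        }
    }

    /** Possible results of one verification round. */
    enum Outcome { VERIFIED, CORRUPTED, NO_SERVER_AVAILABLE }

    /**
     * Tries the online servers in order. A crash moves on to the next server;
     * a corrupted snapshot is reported immediately so the caller (the parent
     * procedure in the description) can decide to retry the snapshot.
     */
    static Outcome verifyWithFailover(List<String> onlineServers, RemoteVerifier verifier) {
        List<String> candidates = new ArrayList<>(onlineServers);
        while (!candidates.isEmpty()) {
            String target = candidates.remove(0);
            try {
                return verifier.verify(target) ? Outcome.VERIFIED : Outcome.CORRUPTED;
            } catch (ServerCrashedException e) {
                // Target crashed: fall through and pick another online server.
            }
        }
        return Outcome.NO_SERVER_AVAILABLE;
    }

    public static void main(String[] args) {
        // Assumed scenario: rs1 has crashed, rs2 and rs3 are healthy.
        Set<String> crashed = Set.of("rs1");
        RemoteVerifier verifier = server -> {
            if (crashed.contains(server)) {
                throw new ServerCrashedException(server);
            }
            return true;
        };
        System.out.println(verifyWithFailover(List.of("rs1", "rs2", "rs3"), verifier));
    }
}
```

In this assumed scenario the request to rs1 fails, the loop falls back to rs2, and the program prints VERIFIED.
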