Re: [PR] HBASE-29025: Enhance the full backup command to support Continuous Backup [hbase]

via GitHub Mon, 24 Feb 2025 13:31:18 -0800


anmolnar commented on code in PR #6710:
URL: https://github.com/apache/hbase/pull/6710#discussion_r1968419146



##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java:
##########
@@ -132,67 +147,14 @@ protected void snapshotCopy(BackupInfo backupInfo) throws 
Exception {
   @Override
   public void execute() throws IOException {
     try (Admin admin = conn.getAdmin()) {
-      // Begin BACKUP
       beginBackup(backupManager, backupInfo);
-      String savedStartCode;
-      boolean firstBackup;
-      // do snapshot for full table backup
-
-      savedStartCode = backupManager.readBackupStartCode();
-      firstBackup = savedStartCode == null || Long.parseLong(savedStartCode) 
== 0L;
-      if (firstBackup) {
-        // This is our first backup. Let's put some marker to system table so 
that we can hold the
-        // logs while we do the backup.
-        backupManager.writeBackupStartCode(0L);
-      }
-      // We roll log here before we do the snapshot. It is possible there is 
duplicate data
-      // in the log that is already in the snapshot. But if we do it after the 
snapshot, we
-      // could have data loss.
-      // A better approach is to do the roll log on each RS in the same global 
procedure as
-      // the snapshot.
-      LOG.info("Execute roll log procedure for full backup ...");
-
-      Map<String, String> props = new HashMap<>();
-      props.put("backupRoot", backupInfo.getBackupRootDir());
-      
admin.execProcedure(LogRollMasterProcedureManager.ROLLLOG_PROCEDURE_SIGNATURE,
-        LogRollMasterProcedureManager.ROLLLOG_PROCEDURE_NAME, props);
-
-      newTimestamps = backupManager.readRegionServerLastLogRollResult();
-
-      // SNAPSHOT_TABLES:
-      backupInfo.setPhase(BackupPhase.SNAPSHOT);
-      for (TableName tableName : tableList) {
-        String snapshotName = "snapshot_" + 
Long.toString(EnvironmentEdgeManager.currentTime())
-          + "_" + tableName.getNamespaceAsString() + "_" + 
tableName.getQualifierAsString();
-
-        snapshotTable(admin, tableName, snapshotName);
-        backupInfo.setSnapshotName(tableName, snapshotName);
+
+      if (backupInfo.isContinuousBackupEnabled()) {
+        handleContinuousBackup(admin);
+      } else {
+        handleNonContinuousBackup(admin);

Review Comment:
   If we introduced new backup type for continuous backup, you would be able to 
create a brand new class just for the continuous backup potententially sharing 
some code if needed.
   This if-else branch is a code smell I believe.



##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java:
##########
@@ -132,67 +147,14 @@ protected void snapshotCopy(BackupInfo backupInfo) throws 
Exception {
   @Override
   public void execute() throws IOException {
     try (Admin admin = conn.getAdmin()) {
-      // Begin BACKUP
       beginBackup(backupManager, backupInfo);
-      String savedStartCode;
-      boolean firstBackup;
-      // do snapshot for full table backup
-
-      savedStartCode = backupManager.readBackupStartCode();
-      firstBackup = savedStartCode == null || Long.parseLong(savedStartCode) 
== 0L;
-      if (firstBackup) {
-        // This is our first backup. Let's put some marker to system table so 
that we can hold the
-        // logs while we do the backup.
-        backupManager.writeBackupStartCode(0L);
-      }
-      // We roll log here before we do the snapshot. It is possible there is 
duplicate data
-      // in the log that is already in the snapshot. But if we do it after the 
snapshot, we
-      // could have data loss.
-      // A better approach is to do the roll log on each RS in the same global 
procedure as
-      // the snapshot.
-      LOG.info("Execute roll log procedure for full backup ...");
-
-      Map<String, String> props = new HashMap<>();
-      props.put("backupRoot", backupInfo.getBackupRootDir());
-      
admin.execProcedure(LogRollMasterProcedureManager.ROLLLOG_PROCEDURE_SIGNATURE,
-        LogRollMasterProcedureManager.ROLLLOG_PROCEDURE_NAME, props);
-
-      newTimestamps = backupManager.readRegionServerLastLogRollResult();
-
-      // SNAPSHOT_TABLES:
-      backupInfo.setPhase(BackupPhase.SNAPSHOT);
-      for (TableName tableName : tableList) {
-        String snapshotName = "snapshot_" + 
Long.toString(EnvironmentEdgeManager.currentTime())
-          + "_" + tableName.getNamespaceAsString() + "_" + 
tableName.getQualifierAsString();
-
-        snapshotTable(admin, tableName, snapshotName);
-        backupInfo.setSnapshotName(tableName, snapshotName);
+
+      if (backupInfo.isContinuousBackupEnabled()) {
+        handleContinuousBackup(admin);
+      } else {
+        handleNonContinuousBackup(admin);

Review Comment:
   Otherwise code refactoring looks good to me.



##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java:
##########
@@ -339,14 +341,24 @@ public void execute() throws IOException {
 
       boolean ignoreChecksum = cmdline.hasOption(OPTION_IGNORECHECKSUM);
 
+      BackupType backupType = BackupType.valueOf(args[1].toUpperCase());
+      List<TableName> tableNameList = null;
+      if (tables != null) {
+        tableNameList = 
Lists.newArrayList(BackupUtils.parseTableNames(tables));
+      }
+      boolean continuousBackup = 
cmdline.hasOption(OPTION_ENABLE_CONTINUOUS_BACKUP);
+      if (continuousBackup && !BackupType.FULL.equals(backupType)) {
+        System.out.println("ERROR: Continuous backup can Only be specified for 
Full Backup");
+        printUsage();
+        throw new IOException(INCORRECT_USAGE);
+      }

Review Comment:
   I'm just wondering, since continuous backup is meaningful only for full 
backups, what if we introduce a new backup type? Maybe we already discussed 
this, I can't remember.
   ```
   Usage: hbase backup create <type> <backup_path> [options]
     type           "full" to create a full backup image
                    "incremental" to create an incremental backup image
                    "continuous" to create new continuous backup
     backup_path     Full path to store the backup image
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] HBASE-29025: Enhance the full backup command to support Continuous Backup [hbase]

Reply via email to