Hi all,

We would like to propose merging the feature “Continuous Backup and
Point-in-Time Recovery (PITR)” into the main branch.

Background

Existing mechanisms such as replication and snapshots provide
data redundancy but are insufficient for effective point-in-time recovery.

   - *Replication* requires maintaining a live mirror cluster, which
     significantly increases operational costs.
   - *Snapshots* and *incremental snapshots* only capture data at discrete
     points in time, resulting in possible data loss between snapshots.

Limitations of the Current Incremental Backup Solution

The existing incremental backup framework in HBase exhibits several
limitations:

   - *Risk of data loss:* Incremental backups are batch-based, leading to
     potential data loss between backup intervals.
   - *Limited restore flexibility:* Recovery is restricted to specific backup
     timestamps rather than any desired point in time.
   - *WAL management overhead:* Write-Ahead Logs (WALs) cannot be archived
     until the backup operation completes, increasing storage overhead and
     complexity.
   - *Complex tracking:* Manual tracking of backup IDs, job history, and logs
     introduces operational challenges.

Summary of the Proposed Feature

The *Continuous Backup and PITR* feature introduces a continuous and
fine-grained backup mechanism that addresses the above limitations. It
enables:

   - Continuous archival of WALs to support near real-time backup.
   - Restoration of data to any desired point in time (PITR) for improved
     data protection and flexibility.
   - Simplified backup lifecycle and WAL management.
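
To make the PITR idea concrete, here is a minimal Python sketch of the general
technique: pick the latest full backup taken at or before the target time, then
replay archived WAL edits up to that time. It is an illustration only and not
the HBase implementation; the names WalEdit, FullBackup, and
restore_to_point_in_time are hypothetical, and the real design is described in
the design document.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class WalEdit:
    """Hypothetical, simplified stand-in for one archived WAL entry."""
    timestamp: int            # write time, epoch milliseconds
    row: str
    value: Optional[str]      # None models a delete


@dataclass(frozen=True)
class FullBackup:
    """Hypothetical full backup: table contents as of `timestamp`."""
    timestamp: int
    rows: Dict[str, str]


def restore_to_point_in_time(
    backups: List[FullBackup], wal: List[WalEdit], target_ts: int
) -> Dict[str, str]:
    """Rebuild table state as of target_ts (assumed semantics, for illustration)."""
    # Only backups taken at or before the target can serve as a base.
    candidates = [b for b in backups if b.timestamp <= target_ts]
    if not candidates:
        raise ValueError("no full backup at or before the target time")
    base = max(candidates, key=lambda b: b.timestamp)

    table = dict(base.rows)
    # Replay edits written after the base backup, up to and including the target.
    for edit in sorted(wal, key=lambda e: e.timestamp):
        if base.timestamp < edit.timestamp <= target_ts:
            if edit.value is None:
                table.pop(edit.row, None)
            else:
                table[edit.row] = edit.value
    return table
```

Because the WAL is archived continuously in this model, the target time can fall
anywhere after the oldest full backup rather than only on a backup boundary.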

A detailed description of the design and implementation can be found in the
following document:
Design Document: Continuous Backup and Point-in-Time Recovery
<https://docs.google.com/document/d/1csQBMyM1mwpe4QpWkCbyqvsC9F5nUBr4ierOo8IuGpE/edit?pli=1&tab=t.0>

Please review and share your feedback or comments.

Best regards,
Vinayak Hegde
