[
https://issues.apache.org/jira/browse/HBASE-28987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911458#comment-17911458
]
Vinayak Hegde commented on HBASE-28987:
---------------------------------------
*Replication Endpoint Design*
*Overview*
We aim to create a {{StorageReplicationEndpoint}}, which will be configured
through the standard replication peer setup. This setup involves specifying a
storage location and other essential parameters to ensure proper authorization
with the storage system. The endpoint will write data to S3 while leveraging
the replication framework to track which Write-Ahead Logs (WALs) have been read
and up to what offset.
*Current Replication Setup*
In the current replication framework:
# The replication source reads WAL files
from the RegionServer and pushes batches of entries to the
{{ReplicationEndpoint}} via the {{replicate()}} method.
# The {{replicate()}} method processes these entries, performs the backup (or
replication), and responds with success or failure.
# On success, the replication source advances the WAL file offset and never
re-sends the processed entries, moving on to new batches.
# On failure, the same entries are retried until a success response is
received.
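The contract above can be sketched as a small loop. This is only an illustration: {{BatchEndpoint}} and {{ReplicationSourceSketch}} are hypothetical names, not HBase classes, WAL entries are modelled as strings, and the retry count is bounded here only so the sketch terminates.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the endpoint contract described above; not the HBase interface. */
interface BatchEndpoint {
    /** Returns true when the batch was handled successfully, false to request a retry. */
    boolean replicate(List<String> entries);
}

/** Toy replication source: advances its offset only after replicate() reports success. */
class ReplicationSourceSketch {
    private final List<String> wal;
    private final int batchSize;
    private int offset = 0; // index of the next WAL entry that has not been acknowledged

    ReplicationSourceSketch(List<String> wal, int batchSize) {
        this.wal = wal;
        this.batchSize = batchSize;
    }

    int offset() {
        return offset;
    }

    /** Ships one batch and returns the number of replicate() attempts it took. */
    int shipNextBatch(BatchEndpoint endpoint, int maxAttempts) {
        int end = Math.min(offset + batchSize, wal.size());
        List<String> batch = new ArrayList<>(wal.subList(offset, end));
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (endpoint.replicate(batch)) {
                offset = end; // success: these entries are never re-sent
                return attempt;
            }
            // failure: offset is unchanged, so the same batch is retried
        }
        throw new IllegalStateException("batch not acknowledged after " + maxAttempts + " attempts");
    }
}
```

The point of the sketch is that the endpoint's return value is the only thing that moves the offset, so an endpoint must not report success before the data is safe.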
*Challenge with Direct WAL Writing to S3*
Directly writing WAL files to S3 presents a challenge.
* The {{S3ABlockOutputStream}} does not implement {{StreamCapabilities}}, meaning
it lacks support for {{hflush()}} and {{hsync()}}.
* The only reliable way to confirm that data is persisted to S3 is by calling
{{close()}}, which guarantees persistence but results in the creation of
many small files.
*Proposed Solutions*
To address the challenges, we propose two approaches:
*Option 1: Use a Staging Area (HDFS) and Copy to S3*
*Approach:*
* Write WAL entries to a staging area (HDFS) that supports
{{hflush()}} and {{hsync()}}.
* Confirm the backup as soon as the WAL entries are written to HDFS.
* Close the current WAL file and open a new one when it reaches a threshold
size (e.g., 128 MB), or use a flush thread that closes the writer periodically
if it contains data.
* Use separate background threads to copy the WAL files from the staging area
to S3.
*Pros:*
* Writing to HDFS is faster than directly writing to S3.
* Background threads can handle the copy operation, reducing the impact on
real-time processing.
*Cons:*
* WAL files will accumulate in the staging area if there are issues
with S3.
* In case of a disaster (e.g., loss of HDFS), data that is staged but not yet
copied to S3 may be lost.
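A minimal sketch of the Option 1 control flow, under loud assumptions: in-memory maps stand in for HDFS (staging) and S3 (backup), entries are strings, the threshold is counted in characters, and the class name {{StagedBackupSketch}} and its methods are hypothetical. Deleting the staged copy after a successful copy is also an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of Option 1. In-memory maps stand in for HDFS (staging) and S3 (backup). */
class StagedBackupSketch {
    final Map<String, String> staging = new LinkedHashMap<>(); // file name -> flushed contents
    final Map<String, String> backup = new LinkedHashMap<>();  // file name -> copied contents
    private final Deque<String> closedFiles = new ArrayDeque<>(); // rolled files awaiting copy
    private final int rollThresholdChars;
    private String currentName; // name of the open staged file, or null if none is open
    private int fileSeq;

    StagedBackupSketch(int rollThresholdChars) {
        this.rollThresholdChars = rollThresholdChars;
    }

    /** Appends the batch to the open staged file; returning true acknowledges the batch. */
    synchronized boolean replicate(List<String> entries) {
        if (currentName == null) {
            currentName = "wal-" + fileSeq++;
            staging.put(currentName, "");
        }
        StringBuilder sb = new StringBuilder(staging.get(currentName));
        for (String e : entries) {
            sb.append(e).append('\n');
        }
        staging.put(currentName, sb.toString()); // stand-in for write + hflush()/hsync()
        if (sb.length() >= rollThresholdChars) {
            roll();
        }
        return true; // acknowledged once staged, before any copy to the backup store
    }

    /** Closes the open file if it holds data; a periodic flush thread could call this too. */
    synchronized void roll() {
        if (currentName != null && !staging.get(currentName).isEmpty()) {
            closedFiles.add(currentName);
            currentName = null;
        }
    }

    /** Copies rolled files to the backup store; intended to run on a background thread. */
    synchronized int copyClosedFiles() {
        int copied = 0;
        String name;
        while ((name = closedFiles.poll()) != null) {
            backup.put(name, staging.get(name));
            staging.remove(name); // assumption: the staged copy is deleted after a successful copy
            copied++;
        }
        return copied;
    }
}
```

Between the acknowledgement in {{replicate()}} and the next {{copyClosedFiles()}} run, the data exists only in the staging map, which is the exposure window described in the second con above.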
*Option 2: Simultaneously Write to Staging Area and S3*
*Approach:*
* Open two WAL writers simultaneously: one for the staging area
(HDFS) and one for S3.
* In the {{replicate()}} method, write entries to both the staging area and S3.
* Confirm the backup only when both writes succeed.
* Close the S3 WAL writer when the file reaches the size threshold or when a
flush thread closes it, and delete the corresponding file from the staging area.
* In case of a server restart or failure, the file in the staging area can be
copied to S3 to prevent data loss.
*Pros:*
* Data is written directly to the backup location (S3), reducing the
risk of losing significant amounts of data if HDFS fails.
*Cons:*
* Writing to S3 is slower than writing to HDFS.
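A comparable sketch of the Option 2 flow, with the same caveats: in-memory maps stand in for HDFS and S3, the S3 object only becomes visible when the writer is closed, and every name here ({{DualWriteSketch}}, {{s3Available}}, {{crash()}}, {{recover()}}) is hypothetical. The S3 availability check happens before either write, which sidesteps the partial-write case a real implementation would have to handle.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of Option 2. In-memory maps stand in for HDFS (staging) and S3 (backup). */
class DualWriteSketch {
    final Map<String, String> staging = new LinkedHashMap<>(); // durable after every write
    final Map<String, String> backup = new LinkedHashMap<>();  // object appears only on close
    boolean s3Available = true;     // toggled to simulate an S3 write failure
    private final int closeThresholdChars;
    private StringBuilder s3Buffer; // data handed to the S3 writer but not yet persisted
    private String currentName;
    private int fileSeq;

    DualWriteSketch(int closeThresholdChars) {
        this.closeThresholdChars = closeThresholdChars;
    }

    /** Writes the batch to both writers; true is returned only if both writes succeed. */
    synchronized boolean replicate(List<String> entries) {
        if (!s3Available) {
            return false; // simplification: checked up front, so nothing is half-written
        }
        if (currentName == null) {
            currentName = "wal-" + fileSeq++;
            staging.put(currentName, "");
            s3Buffer = new StringBuilder();
        }
        StringBuilder text = new StringBuilder();
        for (String e : entries) {
            text.append(e).append('\n');
        }
        staging.put(currentName, staging.get(currentName) + text); // staging write, flushed
        s3Buffer.append(text);                                     // S3 write, not yet durable
        if (s3Buffer.length() >= closeThresholdChars) {
            closeCurrent();
        }
        return true;
    }

    /** Closes the S3 writer, which persists the object, then drops the staged copy. */
    synchronized void closeCurrent() {
        if (currentName == null || s3Buffer.length() == 0) {
            return;
        }
        backup.put(currentName, s3Buffer.toString());
        staging.remove(currentName);
        currentName = null;
        s3Buffer = null;
    }

    /** Simulates a server failure: the open S3 writer and its buffered data are lost. */
    synchronized void crash() {
        currentName = null;
        s3Buffer = null;
    }

    /** After a restart, copies whatever is left in staging to the backup store. */
    synchronized int recover() {
        int recovered = staging.size();
        backup.putAll(staging);
        staging.clear();
        return recovered;
    }
}
```

In this model the staged file is what makes the early acknowledgement safe: after {{crash()}}, the unclosed S3 data is gone, and {{recover()}} rebuilds it from staging.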
> Developing a Custom ReplicationEndpoint to Support External Storage
> Integration
> -------------------------------------------------------------------------------
>
> Key: HBASE-28987
> URL: https://issues.apache.org/jira/browse/HBASE-28987
> Project: HBase
> Issue Type: Task
> Components: backup&restore
> Affects Versions: 2.6.0, 3.0.0-alpha-4
> Reporter: Vinayak Hegde
> Assignee: Vinayak Hegde
> Priority: Major
>
> *Develop a Custom Replication Endpoint*
> Implement a custom replication endpoint to support the backup of WALs to
> external storage systems, such as HDFS-compliant storages (including HDFS,
> S3, ADLS, and GCS via respective Hadoop connectors).
> *Support for Bulk-loaded Files*
> Add functionality to back up bulk-loaded files in addition to regular WALs.
> *Ensure Process Durability*
> Ensure the backup process is durable, with no WALs being missed, even in the
> event of issues in the cluster.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)