[ 
https://issues.apache.org/jira/browse/HBASE-28987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911458#comment-17911458
 ] 

Vinayak Hegde commented on HBASE-28987:
---------------------------------------

*Replication Endpoint Design*
*Overview*
We aim to create a {{StorageReplicationEndpoint}}, configured through the standard replication peer setup. The peer configuration specifies a storage location and the other parameters needed to authorize against the storage system. The endpoint will write data to S3 while leveraging the replication framework to track which Write-Ahead Logs (WALs) have been read and up to what offset.

*Current Replication Setup*
In the current replication framework:
 # The replication source reads WAL files from the RegionServer and pushes batches of entries to the {{ReplicationEndpoint}} via the {{replicate()}} method.
 # The {{replicate()}} method processes these entries, performs the backup (or replication), and responds with success or failure.
 # On success, the replication source advances the WAL file offset and never re-sends the processed entries, moving on to new batches.
 # On failure, the same entries are retried until a success response is received.
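
The retry-until-success contract above can be sketched with stand-in types (names here are illustrative; the real interface is HBase's {{ReplicationEndpoint}}, whose {{replicate()}} takes a {{ReplicateContext}} rather than a plain list):

```java
import java.util.List;

// Stand-in for the replicate() contract described above; not the real
// org.apache.hadoop.hbase.replication.ReplicationEndpoint interface.
interface BackupEndpoint {
    // Returns true on success, false on failure (the source will retry).
    boolean replicate(List<String> walEntries);
}

class ReplicationSourceLoop {
    private long walOffset = 0;

    // Push one batch; retry the same batch until the endpoint reports
    // success, then advance the WAL offset so entries are never re-sent.
    void pushBatch(BackupEndpoint endpoint, List<String> batch, long batchBytes) {
        while (!endpoint.replicate(batch)) {
            // back off and retry the identical batch (sleep omitted here)
        }
        walOffset += batchBytes;
    }

    long getWalOffset() { return walOffset; }
}
```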

*Challenge with Direct WAL Writing to S3*
Directly writing WAL files to S3 presents a challenge:
 * The {{S3ABlockOutputStream}} does not implement {{StreamCapabilities}}, meaning it lacks support for {{hflush()}} and {{hsync()}}.
 * The only reliable way to confirm that data is persisted to S3 is to call {{close()}}, which guarantees persistence but results in the creation of many small files.
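
In Hadoop this is discoverable at runtime via the {{StreamCapabilities}} probe ({{hasCapability("hflush")}} on the output stream). A minimal sketch of such a guard, using a stand-in interface rather than Hadoop's actual {{FSDataOutputStream}}:

```java
// Stand-in for Hadoop's org.apache.hadoop.fs.StreamCapabilities probe;
// real code would call FSDataOutputStream.hasCapability("hflush").
interface CapabilityStream {
    boolean hasCapability(String capability);
}

class DurabilityCheck {
    // True if the stream can make data durable without close(),
    // i.e. it supports hflush/hsync as HDFS output streams do.
    static boolean supportsIncrementalDurability(CapabilityStream out) {
        return out.hasCapability("hflush") && out.hasCapability("hsync");
    }
}
```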

*Proposed Solutions*
To address these challenges, we propose two approaches:

*Option 1: Use a Staging Area (HDFS) and Copy to S3*
*Approach:*
 * Write WAL entries to a staging area (HDFS) that supports {{hflush()}} and {{hsync()}}.
 * Confirm the backup as soon as the WAL entries are written to HDFS.
 * Close the current WAL file and open a new one when it reaches a threshold size (e.g., 128 MB), or use a flush thread that periodically closes the writer if it contains data.
 * Use separate background threads to copy the WAL files from the staging area to S3.

*Pros:*
 * Writing to HDFS is faster than writing directly to S3.
 * Background threads handle the copy operation, reducing the impact on real-time processing.

*Cons:*
 * WAL files will accumulate in the staging area if S3 is unavailable.
 * In case of a disaster (e.g., loss of HDFS), staged data that has not yet been copied to S3 may be lost.
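
Option 1's roll policy could be sketched as below; {{StagingWalRollPolicy}} and its thresholds are illustrative stand-ins, not HBase APIs:

```java
// Sketch of Option 1's roll policy: close the staging WAL writer when it
// reaches a size threshold, or when the periodic flush thread finds it
// has buffered data. Names and thresholds are illustrative.
class StagingWalRollPolicy {
    private final long maxBytes;        // e.g. 128 MB
    private final long flushIntervalMs; // period of the flush thread

    StagingWalRollPolicy(long maxBytes, long flushIntervalMs) {
        this.maxBytes = maxBytes;
        this.flushIntervalMs = flushIntervalMs;
    }

    // Called after each batch is appended to the staging writer.
    boolean shouldRollOnSize(long currentFileBytes) {
        return currentFileBytes >= maxBytes;
    }

    // Called by the periodic flush thread: roll only if the file has data,
    // so idle writers are not churned into empty files.
    boolean shouldRollOnTimer(long currentFileBytes, long msSinceLastRoll) {
        return currentFileBytes > 0 && msSinceLastRoll >= flushIntervalMs;
    }
}
```

Files closed by either rule would then be queued for the background copy threads to upload to S3.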

*Option 2: Simultaneously Write to Staging Area and S3*
*Approach:*
 * Open two WAL writers simultaneously: one for the staging area (HDFS) and one for S3.
 * In the {{replicate()}} method, write entries to both the staging area and S3.
 * Confirm the backup only when both writes succeed.
 * Close the S3 WAL writer when the file reaches the size threshold or when the flush thread closes it, then delete the corresponding file from the staging area.
 * After a server restart or failure, the file remaining in the staging area can be copied to S3 to prevent data loss.

*Pros:*
 * Data is written directly to the backup location (S3), reducing the risk of losing significant amounts of data if HDFS fails.

*Cons:*
 * Writing to S3 is slower than writing to HDFS.
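
Option 2's acknowledgement rule could be sketched as follows; {{Sink}} and {{DualWriteEndpoint}} are hypothetical stand-ins for the two WAL writers, not HBase classes:

```java
import java.io.IOException;
import java.util.List;

// Stand-in for a WAL writer (staging or S3); not an HBase API.
interface Sink {
    void append(List<String> entries) throws IOException;
}

class DualWriteEndpoint {
    private final Sink staging;
    private final Sink s3;

    DualWriteEndpoint(Sink staging, Sink s3) {
        this.staging = staging;
        this.s3 = s3;
    }

    // Mirrors replicate(): report success only when both writes succeed,
    // so the source retries the batch whenever either destination fails.
    boolean replicate(List<String> batch) {
        try {
            staging.append(batch);
            s3.append(batch);
            return true;
        } catch (IOException e) {
            return false; // source will re-send the same batch
        }
    }
}
```

Because a false return makes the source re-send the batch, a failed S3 write never advances the WAL offset, which is what makes the "confirm only when both writes succeed" rule safe.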

> Developing a Custom ReplicationEndpoint to Support External Storage 
> Integration
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-28987
>                 URL: https://issues.apache.org/jira/browse/HBASE-28987
>             Project: HBase
>          Issue Type: Task
>          Components: backup&restore
>    Affects Versions: 2.6.0, 3.0.0-alpha-4
>            Reporter: Vinayak Hegde
>            Assignee: Vinayak Hegde
>            Priority: Major
>
> *Develop a Custom Replication Endpoint*
> Implement a custom replication endpoint to support the backup of WALs to 
> external storage systems, such as HDFS-compliant storages (including HDFS, 
> S3, ADLS, and GCS via respective Hadoop connectors).
> *Support for Bulk-loaded Files*
> Add functionality to back up bulk-loaded files in addition to regular WALs.
> *Ensure Process Durability*
> Ensure the backup process is durable, with no WALs being missed, even in the 
> event of issues in the cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
