[jira] [Comment Edited] (HBASE-28850) Only return from ReplicationSink.replicationEntries while all background tasks are finished

Andrew Kyle Purtell (Jira) Wed, 18 Sep 2024 11:06:00 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-28850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882776#comment-17882776
 ]


Andrew Kyle Purtell edited comment on HBASE-28850 at 9/18/24 6:04 PM:
----------------------------------------------------------------------

I am not saying this approach is wrong, but it will change the performance of 
replicateEntries. Before the change, a replicateEntries call returns as soon 
as the local edits on the sink are _scheduled_ for application. After the 
change, replicateEntries will block for the entire time it takes to confirm 
that all local edits in the batch are applied, including waiting for retries 
and backoff if a local server is unavailable. That may reduce replication 
throughput because of the new backpressure on the source, as we block in 
replicateEntries for what I assume will be a longer time on average. 

On the other hand, it can be a reasonable design decision to not return 
"success" to the source unless all edits in the batch are confirmed to be 
applied. 

It would make failure handling more reliable: if we do not wait for all 
futures to complete, we may miss local exceptions and return "success" to the 
source prematurely.
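
The tradeoff above can be illustrated with a minimal Java sketch. This is not 
the HBase implementation: the class SinkSketch, the helper applyAsync, and the 
use of strings as "edits" are all hypothetical stand-ins, and returning false 
on failure is only an assumption made to keep the example small. It contrasts 
returning once work is scheduled with waiting on every future.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical sketch; none of these names come from HBase.
class SinkSketch {

  // Stand-in for asynchronously applying one edit on the sink.
  // Edits whose name starts with "bad" simulate a local failure.
  static CompletableFuture<Void> applyAsync(String edit) {
    return CompletableFuture.runAsync(() -> {
      if (edit.startsWith("bad")) {
        throw new IllegalStateException("failed to apply " + edit);
      }
    });
  }

  // Returns as soon as the edits are scheduled. The futures are dropped,
  // so a failure in any of them is never observed by the caller.
  static boolean replicateScheduledOnly(List<String> edits) {
    for (String edit : edits) {
      applyAsync(edit);
    }
    return true; // "success" is reported regardless of the outcome
  }

  // Blocks until every future has completed. Slower on average, but a
  // failure in any edit is surfaced instead of being silently lost.
  static boolean replicateAndWait(List<String> edits) {
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (String edit : edits) {
      futures.add(applyAsync(edit));
    }
    try {
      CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
      return true;
    } catch (CompletionException e) {
      return false; // assumption: report the batch as failed
    }
  }
}
```

With a batch containing one failing edit, replicateScheduledOnly still reports 
success, while replicateAndWait reports failure at the cost of blocking until 
all edits have been attempted.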


was (Author: apurtell):
I am not saying this approach is wrong but it will change the performance of 
replicateEntries. Before, the replicateEntries call will return as soon as the 
local edits on the sink are _scheduled_ for application. After, the 
replicateEntries will block for the entire time it takes to confirm all local 
edits in the batch are applied, including waiting for retries and backoff if 
there is local server unavailability. And that may cause a reduction in 
replication throughput because of the new backpressure on the source, as we 
block in replicateEntries for what I assume will be a longer time on average. 

On the other hand, it can be a reasonable design decision to not return 
"success" to the source unless all edits in the batch are confirmed to be 
applied. It would make failure handling more reliable because we may miss 
exceptions and not fail the batch back to the source if we are not waiting for 
all futures to complete.

> Only return from ReplicationSink.replicationEntries while all background 
> tasks are finished
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28850
>                 URL: https://issues.apache.org/jira/browse/HBASE-28850
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication, rpc
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>              Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
