[ https://issues.apache.org/jira/browse/HBASE-26768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenglei updated HBASE-26768:
-----------------------------
    Description: 
It seems that the problem HBASE-26449 described still exists, just as in the following {{RegionReplicationSink.onComplete}}, which runs in Netty's nioEventLoop. Assume we have only one secondary replica. First we add the replica to {{RegionReplicationSink.failedReplicas}} because replication failed, at line 228 below. But before we reach line 238, the {{Flusher}} calls {{RegionReplicationSink.add}}, which clears {{RegionReplicationSink.failedReplicas}} because of a flush-all edit. When the Netty nioEventLoop then continues to line 238, we still add the replica to {{failedReplicas}} even though {{maxSequenceId < lastFlushedSequenceId}}:


{code:java}
207   private void onComplete(List<SinkEntry> sent,
208       Map<Integer, MutableObject<Throwable>> replica2Error) {
          ...
217     Set<Integer> failed = new HashSet<>();
218     for (Map.Entry<Integer, MutableObject<Throwable>> entry : replica2Error.entrySet()) {
219       Integer replicaId = entry.getKey();
220       Throwable error = entry.getValue().getValue();
221       if (error != null) {
222         if (maxSequenceId > lastFlushedSequenceId) {
              ...
228           failed.add(replicaId);
229         } else {
              ...
238     synchronized (entries) {
239       pendingSize -= toReleaseSize;
240       if (!failed.isEmpty()) {
241         failedReplicas.addAll(failed);
242         flushRequester.requestFlush(maxSequenceId);
243       }
          ...
253     }
254   }
{code}
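
To make the interleaving easier to see, below is a minimal single-class model of the race (hypothetical names and values, not HBase code): the first call plays the nioEventLoop evaluating line 222 outside the lock, the second plays the {{Flusher}}'s flush-all edit, and the third plays lines 238-243, with and without the double check proposed at the end of this description.

{code:java}
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone sketch of the check-then-act race, not HBase code.
public class FailedReplicaRaceSketch {
  private final Set<Integer> failedReplicas = new HashSet<>();
  private final Object entries = new Object();
  private volatile long lastFlushedSequenceId = 0;

  // Models line 222: the failure check runs outside the lock on entries,
  // so its result can be stale by the time line 238 is reached.
  boolean replicationFailed(long maxSequenceId) {
    return maxSequenceId > lastFlushedSequenceId;
  }

  // Models the Flusher calling add() with a flush-all edit: it advances
  // lastFlushedSequenceId and clears failedReplicas.
  void flushAllEdit(long flushedSequenceId) {
    synchronized (entries) {
      lastFlushedSequenceId = flushedSequenceId;
      failedReplicas.clear();
    }
  }

  // Models lines 238-243. Without the double check, the stale 'failed' result
  // re-marks the replica even though maxSequenceId < lastFlushedSequenceId.
  void finishOnComplete(int replicaId, long maxSequenceId, boolean failed,
      boolean doubleCheck) {
    synchronized (entries) {
      if (failed && (!doubleCheck || maxSequenceId > lastFlushedSequenceId)) {
        failedReplicas.add(replicaId);
      }
    }
  }

  public static void main(String[] args) {
    FailedReplicaRaceSketch sink = new FailedReplicaRaceSketch();
    long maxSequenceId = 10;
    boolean failed = sink.replicationFailed(maxSequenceId); // true: replication failed
    sink.flushAllEdit(20);                                  // flush-all edit catches up
    sink.finishOnComplete(1, maxSequenceId, failed, false);
    System.out.println("without double check: " + sink.failedReplicas); // [1]
    sink.failedReplicas.clear();
    sink.finishOnComplete(1, maxSequenceId, failed, true);
    System.out.println("with double check: " + sink.failedReplicas);    // []
  }
}
{code}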

What is worse, when we invoke {{RegionReplicationFlushRequester.requestFlush}}, the flush may be skipped: in the following {{RegionReplicationFlushRequester.flush}}, {{pendingFlushRequestSequenceId}} is less than {{lastFlushedSequenceId}}, so {{request()}} is never called. The only secondary replica is thus marked failed while the requested flush is skipped, and replication may stay suspended until the next memstore flush:

{code:java}
private synchronized void flush(Timeout timeout) {
    pendingFlushRequest = null;
    if (pendingFlushRequestSequenceId >= lastFlushedSequenceId) {
      request();
    }
  }
{code}
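
To make the skip concrete, suppose the failed batch had {{maxSequenceId}} = 10 and the flush-all edit had already advanced {{lastFlushedSequenceId}} to 20 before the timeout fires; the guard above then drops the request. A trivial standalone trace (hypothetical values):

{code:java}
public class FlushSkipTrace {
  public static void main(String[] args) {
    long lastFlushedSequenceId = 20;         // advanced by the flush-all edit
    long pendingFlushRequestSequenceId = 10; // from requestFlush(maxSequenceId)

    // The guard in flush(): 10 >= 20 is false, so request() is never called;
    // the only replica stays marked failed with no flush scheduled.
    if (pendingFlushRequestSequenceId >= lastFlushedSequenceId) {
      System.out.println("flush triggered");
    } else {
      System.out.println("flush request dropped"); // this branch runs
    }
  }
}
{code}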

I simulate this problem in the PR, and my fix is to double-check {{maxSequenceId > lastFlushedSequenceId}} inside the {{synchronized}} block in {{RegionReplicationSink.onComplete}}.
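
For reference, a sketch of what that double check could look like inside the {{synchronized}} block (my reading of the idea, not the exact patch; see the PR for the real change):

{code:java}
// Sketch only, not the exact patch: re-validate under the lock on entries.
synchronized (entries) {
  pendingSize -= toReleaseSize;
  // A flush-all edit in add() may have advanced lastFlushedSequenceId and
  // cleared failedReplicas after the error loop ran; in that case the stale
  // 'failed' set must not be applied and no flush needs to be requested.
  if (maxSequenceId > lastFlushedSequenceId && !failed.isEmpty()) {
    failedReplicas.addAll(failed);
    flushRequester.requestFlush(maxSequenceId);
  }
}
{code}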



> Avoid unnecessary replication suspending in RegionReplicationSink
> -----------------------------------------------------------------
>
>                 Key: HBASE-26768
>                 URL: https://issues.apache.org/jira/browse/HBASE-26768
>             Project: HBase
>          Issue Type: Bug
>          Components: read replicas
>    Affects Versions: 3.0.0-alpha-2
>            Reporter: chenglei
>            Priority: Major
>


