[
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709387#comment-17709387
]
chenglei edited comment on HBASE-27778 at 4/7/23 8:44 AM:
----------------------------------------------------------
Pushed to 2.4+, thanks [~zhangduo] and [~Xiaolin Ha] for reviewing.
was (Author: comnetwork):
Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha] for reviewing.
> Incorrect ReplicationSourceWALReader.totalBufferUsed may cause replication
> hang up
> ------------------------------------------------------------------------------------
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.17, 2.5.4
> Reporter: chenglei
> Assignee: chenglei
> Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.5, 2.4.18
>
>
> When we read a new WAL entry in
> {{ReplicationSourceWALReader.readWALEntries}}, we increase
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry in
> {{ReplicationSourceWALReader.addEntryToBatch}}. However, the whole
> {{WALEntryBatch}} may never be put into
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g.,
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL entry),
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in that
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is
> actually scoped to {{ReplicationSourceManager}}, the leaked bytes accumulate
> over a long run, and replication to all peers may hang.
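
The accounting problem described above can be illustrated with a small standalone sketch. It does not use the actual HBase classes: {{BufferQuotaSketch}}, {{readBatchLeaky}} and {{readBatchSafe}} are hypothetical names, entries are reduced to their sizes, and a {{Predicate}} stands in for {{WALEntryFilter.filter}}. The release-on-failure variant is only one possible way to keep the counter consistent, not necessarily how the committed patch does it.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Predicate;

// Hypothetical stand-in for the reader; not the real HBase implementation.
class BufferQuotaSketch {
    // Shared by every reader, like totalBufferUsed scoped to the manager.
    static final AtomicLong totalBufferUsed = new AtomicLong();

    // Leaky variant: bytes are charged per entry, but if the filter throws
    // on a later entry the batch is never queued and nothing is released.
    static void readBatchLeaky(List<Long> entrySizes, Predicate<Long> filter) {
        for (long size : entrySizes) {
            if (filter.test(size)) {            // may throw
                totalBufferUsed.addAndGet(size);
            }
        }
        // On success the batch would be queued here, and the consumer
        // would release the bytes after shipping it.
    }

    // Safer variant (an assumption, for illustration): remember what this
    // batch charged and give it back if the batch is abandoned.
    static void readBatchSafe(List<Long> entrySizes, Predicate<Long> filter) {
        long chargedByThisBatch = 0;
        try {
            for (long size : entrySizes) {
                if (filter.test(size)) {
                    totalBufferUsed.addAndGet(size);
                    chargedByThisBatch += size;
                }
            }
        } catch (RuntimeException e) {
            totalBufferUsed.addAndGet(-chargedByThisBatch);
            throw e;
        }
    }

    public static void main(String[] args) {
        // Filter that fails on the third entry of the batch.
        Predicate<Long> failing = s -> {
            if (s == 30L) {
                throw new IllegalStateException("filter failed");
            }
            return true;
        };
        try {
            readBatchLeaky(List.of(10L, 20L, 30L), failing);
        } catch (IllegalStateException expected) {
            // The batch is dropped, but 10 + 20 bytes stay charged.
        }
        System.out.println("leaky: " + totalBufferUsed.get());

        totalBufferUsed.set(0);
        try {
            readBatchSafe(List.of(10L, 20L, 30L), failing);
        } catch (IllegalStateException expected) {
            // The charged bytes were released before rethrowing.
        }
        System.out.println("safe: " + totalBufferUsed.get());
    }
}
```

In the leaky variant every failed batch permanently consumes part of the shared quota, which is why the effect only shows up after a long run and affects all peers rather than just the one whose filter threw.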