[ https://issues.apache.org/jira/browse/HBASE-28620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HBASE-28620:
-----------------------------------
Labels: pull-request-available (was: )
> replication quota leak when peer changes
> ----------------------------------------
>
> Key: HBASE-28620
> URL: https://issues.apache.org/jira/browse/HBASE-28620
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: MisterWang
> Priority: Critical
> Labels: pull-request-available
>
> When a peer changes, replication closes the reader and shipper that were
> created earlier. However, if the shipper has still not stopped after the
> specified timeout, the existing code simply returns without releasing the
> buffer quota and logs "Not cleaning buffer usage".
> In one case at my company, the quota filled up because it was not released
> in time, so the WAL reader could not read new data and replication built up
> a backlog.
>
> The log is as follows:
> 2024-05-20 20:00:00,796 WARN
> [RpcServer.default.FPRWQ.Fifo.read.handler=70,queue=1,port=16020]
> regionserver.ReplicationSourceShipper: Shipper clearWALEntryBatch method
> timed out whilst waiting reader/shipper thread to stop. Not cleaning buffer
> usage. Shipper alive: peer1; Reader alive: false
> 2024-05-20 20:00:01,351 WARN peer=peer1, can't read more edits from WAL as
> buffer usage 268435456B exceeds limit 268435456B
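
The failure mode described above can be illustrated with a small, self-contained
sketch. This is a simplified model and not HBase code: the class and method
names (QuotaLeakSketch, BufferQuota, Batch, clearBatchLeaky,
clearBatchReleasing) are hypothetical, the timeout is reduced to a boolean
flag, and the "releasing" variant is only one possible remedy, not necessarily
the fix attached to this issue.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of a shared replication buffer quota (hypothetical, not HBase code). */
public class QuotaLeakSketch {

    /** Hypothetical stand-in for the global buffer-usage counter. */
    static final class BufferQuota {
        private final long limit;
        private final AtomicLong used = new AtomicLong();

        BufferQuota(long limit) { this.limit = limit; }

        void acquire(long bytes) { used.addAndGet(bytes); }
        void release(long bytes) { used.addAndGet(-bytes); }
        long usedBytes() { return used.get(); }

        /** The reader may continue only while usage is below the limit. */
        boolean canRead() { return used.get() < limit; }
    }

    /** A batch whose bytes are returned to the quota at most once. */
    static final class Batch {
        final long bytes;
        private final AtomicBoolean released = new AtomicBoolean(false);

        Batch(long bytes) { this.bytes = bytes; }

        void releaseOnce(BufferQuota quota) {
            // compareAndSet guards against a double release if two paths race.
            if (released.compareAndSet(false, true)) {
                quota.release(bytes);
            }
        }
    }

    /** Models the reported behaviour: on timeout, return without releasing. */
    static void clearBatchLeaky(BufferQuota quota, Batch batch, boolean shipperStopped) {
        if (!shipperStopped) {
            // "Not cleaning buffer usage": the reserved bytes stay counted.
            return;
        }
        batch.releaseOnce(quota);
    }

    /** One possible remedy (an assumption): release even when the wait timed out. */
    static void clearBatchReleasing(BufferQuota quota, Batch batch, boolean shipperStopped) {
        // shipperStopped is deliberately ignored; releaseOnce keeps this idempotent.
        batch.releaseOnce(quota);
    }

    public static void main(String[] args) {
        long limit = 256L * 1024 * 1024; // 268435456 bytes, the limit shown in the log

        BufferQuota leaky = new BufferQuota(limit);
        Batch a = new Batch(limit);
        leaky.acquire(a.bytes);
        clearBatchLeaky(leaky, a, false); // shipper did not stop in time
        System.out.println("leaky: canRead=" + leaky.canRead() + " used=" + leaky.usedBytes());

        BufferQuota fixed = new BufferQuota(limit);
        Batch b = new Batch(limit);
        fixed.acquire(b.bytes);
        clearBatchReleasing(fixed, b, false);
        System.out.println("fixed: canRead=" + fixed.canRead() + " used=" + fixed.usedBytes());
    }
}
```

In the leaky path the usage stays at the limit, so the reader is blocked in the
same way as in the second log line. Releasing unconditionally is only safe if
each batch is returned to the quota exactly once, which is why the sketch
guards the release with a flag.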