[ https://issues.apache.org/jira/browse/HBASE-24781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
leizhang updated HBASE-24781:
-----------------------------
Description:
Suppose we have a peer with id 1. After executing the shell commands
disable_peer '1' and enable_peer '1', the SizeOfLogQueue metric of every
regionserver increases by 1. After 10 disable_peer operations it reaches
11, and it never decreases back to 1.
I can see that ReplicationSourceManager.refreshSources(peerId) is called;
it terminates the previous replication source and creates a new one. The
code block below contains the note // Do not clear metrics:
{code:java}
ReplicationSourceInterface toRemove = this.sources.put(peerId, src);
if (toRemove != null) {
  LOG.info("Terminate replication source for " + toRemove.getPeerId());
  // Do not clear metrics
  toRemove.terminate(terminateMessage, null, false);
}
{code}
This causes the wrong value of sizeOfLogQueue. I think it is a sub-issue
of HBASE-23231.
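
The suspected mechanism can be illustrated with a small standalone model. This is only a sketch of one way to read the report, not HBase code: the class LogQueueMetricSketch, the nested Source class, and the shared counter are hypothetical stand-ins, and the model assumes that each new source adds its current log to a gauge that outlives the source, while terminate(...) with the clear-metrics flag set to false leaves the old source's contribution in place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: none of these names are HBase classes.
public class LogQueueMetricSketch {

    // Hypothetical replication source that reports into a shared gauge.
    static final class Source {
        private final AtomicInteger gauge; // gauge shared by all sources of the peer
        private int queued;                // logs this source has added to the gauge

        Source(AtomicInteger gauge) {
            this.gauge = gauge;
        }

        // Enqueue one log and count it in the shared gauge.
        void enqueueLog() {
            queued++;
            gauge.incrementAndGet();
        }

        // Stop the source; only subtract its share when clearMetrics is true.
        void terminate(boolean clearMetrics) {
            if (clearMetrics) {
                gauge.addAndGet(-queued);
            }
            queued = 0;
        }
    }

    // Start one source for peer "1", refresh it the given number of times,
    // and return the final gauge value.
    static int simulate(int refreshes, boolean clearMetrics) {
        AtomicInteger gauge = new AtomicInteger();
        Map<String, Source> sources = new HashMap<>();

        Source first = new Source(gauge);
        first.enqueueLog();
        sources.put("1", first);

        for (int i = 0; i < refreshes; i++) {
            Source src = new Source(gauge);
            src.enqueueLog();
            // Same shape as the snippet in the report: put, then terminate the old one.
            Source toRemove = sources.put("1", src);
            if (toRemove != null) {
                toRemove.terminate(clearMetrics);
            }
        }
        return gauge.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(10, false)); // prints 11
        System.out.println(simulate(10, true));  // prints 1
    }
}
```

In this model, ten refreshes without clearing leave the gauge at 11, which matches the number observed above, while subtracting the old source's share on terminate keeps it at 1. Whether the real fix should clear the metrics or adjust the gauge in some other way is left open here.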
was:
Suppose we have a peer with id 1. After executing the shell commands
disable_peer '1' and enable_peer '1', the SizeOfLogQueue metric of every
regionserver increases by 1. After 10 disable_peer operations it reaches
11, and it never decreases back to 1.
I can see that ReplicationSourceManager.refreshSources(peerId) is called;
it terminates the previous replication source and creates a new one. I
found the note // Do not clear metrics in the code block below:
{code:java}
ReplicationSourceInterface toRemove = this.sources.put(peerId, src);
if (toRemove != null) {
  LOG.info("Terminate replication source for " + toRemove.getPeerId());
  // Do not clear metrics
  toRemove.terminate(terminateMessage, null, false);
}
{code}
This causes the wrong value of sizeOfLogQueue. I think it is a sub-issue
of HBASE-23231.
> [Replication] When execute shell cmd "disable_peer peerId",the master web UI
> show a wrong number of SizeOfLogQueue
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-24781
> URL: https://issues.apache.org/jira/browse/HBASE-24781
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 2.2.5
> Reporter: leizhang
> Priority: Major