massakam opened a new issue, #22442: URL: https://github.com/apache/pulsar/issues/22442
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.

### Motivation

After upgrading the version of Pulsar used at our company, the `publishers` field of topic stats now includes producers whose names start with `pulsar.repl.`, which were not included before. These are producers created by brokers in other clusters for geo-replication.

```json
"publishers" : [ {
  "accessMode" : "Shared",
  "msgRateIn" : 0.0,
  "msgThroughputIn" : 0.0,
  "averageMsgSize" : 0.0,
  "chunkedMessageRate" : 0.0,
  "producerId" : 0,
  "supportsPartialProducer" : false,
  "metadata" : { },
  "address" : "/xxx.xxx.xxx.xxx:37102",
  "connectedSince" : "2024-04-04T14:23:33.666865+09:00",
  "clientVersion" : "2.10.5",
  "producerName" : "pulsar.repl.dev"
} ],
```

Producers for geo-replication started to appear in `publishers` due to the following PR: https://github.com/apache/pulsar/pull/20229

Although this PR is treated as a bug fix, it seems that the exclusion of geo-replication producers from `publishers` was intentional, so I think this is a specification change.

I think geo-replication producers should be excluded from `publishers`. These are unknown producers to normal users, so they confuse users who look at topic stats. I think that the mechanism for achieving geo-replication should be hidden from users as much as possible.

### Solution

I would like to revert https://github.com/apache/pulsar/pull/20229.

In addition to that, the `producerCount` field in broker stats also includes the number of geo-replication producers, so I'd like to exclude that from there too if possible. The `publishers` field in broker stats does not include geo-replication producers and is inconsistent.
```json
"persistent://massakam/test/t1": {
  "publishers": [],
  "replication": {
    "dev": {
      "connected": true,
      "msgRateExpired": 0.0,
      "msgRateIn": 0.999,
      "msgRateOut": 0.0,
      "msgThroughputIn": 68.999,
      "msgThroughputOut": 0.0,
      "replicationBacklog": 0,
      "replicationDelayInSeconds": 0,
      "inboundConnection": "/xxx.xxx.xxx.xxx:37102",
      "inboundConnectedSince": "2024-04-04T14:23:33.666865+09:00",
      "outboundConnection": "[id: 0x291f051a, L:/xxx.xxx.xxx.xxx:37078 - R:dev-broker.pulsar.yahoo.co.jp/xxx.xxx.xxx.xxx:6651]",
      "outboundConnectedSince": "2024-03-22T11:14:40.557319+09:00"
    }
  },
  "subscriptions": {
    "sub1": {
      "consumers": [],
      "msgBacklog": 317,
      "msgRateExpired": 0.0,
      "msgRateOut": 0.0,
      "messageAckRate": 0.0,
      "msgThroughputOut": 0.0,
      "msgRateRedeliver": 0.0,
      "numberOfEntriesSinceFirstNotAckedMessage": 1,
      "totalNonContiguousDeletedMessagesRange": 0,
      "type": "None"
    }
  },
  "producerCount": 1,
  "averageMsgSize": 69.0,
  "msgRateIn": 0.999,
  "msgRateOut": 0.0,
  "msgInCount": 2651,
  "bytesInCount": 161400,
  "msgOutCount": 0,
  "bytesOutCount": 0,
  "msgThroughputIn": 68.999,
  "msgThroughputOut": 0.0,
  "storageSize": 19194,
  "backlogSize": 19194,
  "pendingAddEntriesCount": 0,
  "filteredEntriesCount": 0
}
```

### Alternatives

If https://github.com/apache/pulsar/pull/20229 should not be completely reverted, there is an option to add a query parameter to the REST APIs for retrieving topic stats/broker stats to determine whether `publishers` should include geo-replication producers. Even in this case, I'd like to hide geo-replication producers by default.

I suggest `showReplicatorProducers` as the name of the query parameter.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [X] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
