zzzming opened a new pull request, #815:
URL: https://github.com/apache/pulsar-client-go/pull/815
Implement #814
### Motivation
When a Pulsar cluster is deployed on Kubernetes, or behind a proxy/LB, we
need counter metrics that track reconnection failures of producers and
consumers.
When brokers go offline but the proxy/LB is still functioning, the TCP
connection can still be established but the topic lookup fails. The
`pulsar_client_connections_establishment_errors` counter is not incremented in
this case, so new counters are required to track such failures.
### Modifications
Two new counter metrics, `pulsar_client_producers_reconnect_failure` and
`pulsar_client_consumers_reconnect_failure`, are incremented in the
producer_partition and consumer_partition retry failure code blocks.
Because producer/consumer reconnection to the broker retries with a doubling
backoff, these two counters are incremented only when either of the following
conditions is met, to reduce excessive retry-failure noise:
1. The maximum backoff is reached. This is a three-minute window.
2. `MaxReconnectToBroker`, which the user can set in the ProducerOption or
ConsumerOption, is reached.
The existing code logic already covers the case when the topic does not
exist. The counters are not incremented in that case; the code simply
exits the retry loop at once.
### Verifying this change
This has been verified in a Pulsar cluster deployment with a proxy. We do
not have such a setup in CI because this scenario cannot be tested with
Pulsar standalone mode.
### Does this pull request potentially affect one of the following parts:
*If `yes` was chosen, please highlight the changes*
- Dependencies (does it add or upgrade a dependency): (no)
- The public API: (no)
- The schema: (no)
- The default values of configurations: (no)
- The wire protocol: (no)
### Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable / docs / GoDocs /
not documented)
- If a feature is not applicable for documentation, explain why?
- If a feature is not documented yet in this PR, please create a followup
issue for adding the documentation