sijie commented on issue #6518: health monitoring, alarms
URL: https://github.com/apache/pulsar/issues/6518#issuecomment-604309990
@ilyam8 it is just an example for your reference. Most people define
their alerting rules based on the metrics listed at
https://pulsar.apache.org/docs/en/reference-metrics/. Some people care
about write latency, and others care about the backlog.
If you are looking for failure-rate metrics, currently only BookKeeper
exposes metrics for "success" and "failure". You can use them to
calculate the failure rate across the cluster.
Brokers currently do not have such metrics. We can look into adding
them.
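
As a rough illustration of calculating a failure rate across the cluster from "success" and "failure" counters, here is a minimal Python sketch. It assumes each bookie exposes cumulative success and failure counts that you have scraped twice. The `CounterSample` type, the `cluster_failure_rate` function, and the bookie names are hypothetical and are not part of BookKeeper or Pulsar.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Cumulative counters scraped from one bookie (hypothetical shape)."""
    success: int
    failure: int


def cluster_failure_rate(previous, current):
    """Return failures / (successes + failures) summed over all bookies
    for the interval between two scrapes, or None if nothing happened."""
    total_success = 0
    total_failure = 0
    for bookie, now in current.items():
        before = previous.get(bookie)
        if before is None:
            continue  # no baseline for this bookie yet, skip it
        delta_success = now.success - before.success
        delta_failure = now.failure - before.failure
        # A negative delta suggests the counter was reset (e.g. a restart);
        # in that case fall back to the current cumulative values.
        if delta_success < 0 or delta_failure < 0:
            delta_success, delta_failure = now.success, now.failure
        total_success += delta_success
        total_failure += delta_failure
    total = total_success + total_failure
    if total == 0:
        return None
    return total_failure / total


# Example with made-up numbers for two bookies.
previous = {"bookie-1": CounterSample(100, 2), "bookie-2": CounterSample(200, 0)}
current = {"bookie-1": CounterSample(190, 12), "bookie-2": CounterSample(300, 0)}
rate = cluster_failure_rate(previous, current)
```

An alert could then fire when `rate` stays above a threshold you choose for several consecutive scrapes. In practice the same calculation is usually expressed as a query in the monitoring system rather than in application code.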