I'm implementing a tool to determine whether the broker is in a healthy
state. A series of health checks can be performed, ranging from the most
basic, which very rarely produce false positives, to increasingly
comprehensive, intrusive, and opinionated ones, which have a higher
probability of false positives.

Here are some possible health checks, grouped by target:
- node
  - up - check if a client can connect to the node
  - disk - check if disk usage has reached the `max-disk-usage` limit
  - memory - check the memory available to the JVM
  - backup - check if the backup node is announced
  - queues - check if all queues with a positive rate have a consumer
- queue
  - up - check if the queue exists
  - browser - check if the queue is browsable
  - consumer - check if a consumer can connect to the queue and/or receive
    messages
  - producer - check if a producer can connect to the queue and/or send
    messages
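
To make the most basic node checks in the list above concrete, here is a minimal sketch that uses only the JDK and none of the broker's own APIs. The class and method names, the host and port, and the thresholds are all hypothetical. In particular, the disk check only approximates a `max-disk-usage` style limit by comparing partition usage against a percentage. The checks run from least to most intrusive and stop at the first failure.

```java
import java.io.File;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: not part of the broker, JDK classes only.
class HealthCheckSketch {

    // Node "up": can a plain TCP client connect within the timeout?
    static boolean nodeUp(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Node "disk": is the used share of the partition below the given percentage?
    // Assumption: this only approximates the broker's own disk limit logic.
    static boolean diskBelowLimit(File dir, int maxDiskUsagePercent) {
        long total = dir.getTotalSpace();
        if (total == 0) {
            return false; // path missing or partition size unknown
        }
        long used = total - dir.getUsableSpace();
        return used * 100.0 / total < maxDiskUsagePercent;
    }

    // Node "memory": is at least minFreeFraction of the JVM's max heap still free?
    static boolean memoryAvailable(double minFreeFraction) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        long max = rt.maxMemory();
        return (double) (max - used) / max >= minFreeFraction;
    }

    // Runs the checks in insertion order and stops after the first failure.
    static Map<String, Boolean> runInOrder(Map<String, BooleanSupplier> checks) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            boolean ok = entry.getValue().getAsBoolean();
            results.put(entry.getKey(), ok);
            if (!ok) {
                break;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, BooleanSupplier> checks = new LinkedHashMap<>();
        // Illustrative values only: host, port, directory, and thresholds are assumptions.
        checks.put("up", () -> nodeUp("localhost", 61616, 2000));
        checks.put("disk", () -> diskBelowLimit(new File("."), 90));
        checks.put("memory", () -> memoryAvailable(0.10));
        runInOrder(checks).forEach((name, ok) ->
                System.out.println(name + ": " + (ok ? "OK" : "FAILED")));
    }
}
```

Stopping at the first failure keeps the more intrusive checks from running against a node that is already known to be unhealthy.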

I would start with some of the checks above, exposing them through the
MBean interfaces and/or the command-line utility.
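
As a rough illustration of the MBean route, a single check could be exposed as a read-only attribute on a standard JMX MBean. This is only a sketch using the JDK's `javax.management` API. The `NodeHealth` names, the `org.example.health` object name, and the threshold are hypothetical and are not taken from the broker's existing management interfaces.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: exposes one health check as a JMX attribute.
class HealthMBeanSketch {

    // Standard MBean convention: interface name = implementation name + "MBean".
    public interface NodeHealthMBean {
        boolean isMemoryAvailable();
    }

    public static class NodeHealth implements NodeHealthMBean {
        private final double minFreeFraction;

        public NodeHealth(double minFreeFraction) {
            this.minFreeFraction = minFreeFraction;
        }

        // True if at least minFreeFraction of the JVM's max heap is still free.
        @Override
        public boolean isMemoryAvailable() {
            Runtime rt = Runtime.getRuntime();
            long used = rt.totalMemory() - rt.freeMemory();
            long max = rt.maxMemory();
            return (double) (max - used) / max >= minFreeFraction;
        }
    }

    // Registers the bean under the given object name and returns that name.
    static ObjectName register(MBeanServer server, String name, NodeHealth bean)
            throws Exception {
        ObjectName objectName = new ObjectName(name);
        server.registerMBean(bean, objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // The object name and the 10% threshold are assumptions for illustration.
        ObjectName name = register(server,
                "org.example.health:type=NodeHealth", new NodeHealth(0.10));
        // The getter isMemoryAvailable() is exposed as the attribute "MemoryAvailable".
        System.out.println("MemoryAvailable = "
                + server.getAttribute(name, "MemoryAvailable"));
    }
}
```

A command-line utility could then read the same attribute over a JMX connection, so both entry points would share one implementation of each check.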

What are your thoughts?

Domenico
