Victor Xu created AMBARI-18624:
Summary: Deeper Alerting for Kafka
Issue Type: Improvement
Affects Versions: 2.2.2
Reporter: Victor Xu
Kafka brokers can become unhealthy while still appearing 'available': the
process is up and the service is running, but the broker is unusable. Current
alerting only checks whether the process is up and running. We want alerting
that tests for situations in which the component is up but non-functional.
Sometimes the brokers are not working but Ambari still registers them as green.
I have seen this in at least one case: OutOfMemory errors in the broker logs.
This could easily be reproduced by lowering the broker heap size and creating
many topics to exhaust the heap. Ambari will still register Kafka as healthy.
We need the Kafka team to identify a script that Ambari can run to probe the
Kafka process to achieve this deeper health alerting capability.
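To make the request concrete, here is a rough sketch of what such a probe might look like. It is only an illustration, not a proposed implementation. The function check_broker and the round_trip callable are hypothetical names. round_trip stands in for whatever functional test the Kafka team recommends (for example, producing a message to a canary topic and reading it back). The result strings and the (state, [message]) return shape are assumed to resemble what Ambari script alerts return.

```python
import time

# Alert states; assumed to resemble the result codes Ambari script alerts use.
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def check_broker(round_trip, warn_seconds=5.0, clock=time.monotonic):
    """Run a functional probe and map its outcome to an alert state.

    round_trip is a hypothetical callable that exercises the broker end to
    end and raises an exception if the broker cannot serve the request.
    """
    start = clock()
    try:
        round_trip()
    except Exception as exc:
        # The process may be up, but the broker could not do real work.
        return CRITICAL, ["Broker probe failed: {0}".format(exc)]

    elapsed = clock() - start
    if elapsed > warn_seconds:
        # The broker answered, but slowly enough to be worth flagging.
        return WARNING, ["Broker probe slow: {0:.1f}s".format(elapsed)]
    return OK, ["Broker probe succeeded in {0:.1f}s".format(elapsed)]
```

The point of the sketch is that the alert state comes from whether the broker can actually do work, not from whether the process exists. A broker that is up but throwing OutOfMemory errors would fail the round trip and be reported as CRITICAL.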