Hi all, we are running HBase on a couple of our clusters. Under heavy load, when something goes wrong, we have no clue what happened or why. Ganglia and Nagios are not helping. We built a dashboard with the ELK stack from the metrics exposed by the region servers and the master servers, but I'm wondering what the standard approach is in these cases.
What do you use to monitor HBase in your production clusters, and why do you think it's better than the alternatives?
Kind regards
