LinkedIn has a custom monitoring system partially described here: http://engineering.linkedin.com/52/autometrics-self-service-metrics-collection
The integration on the Kafka side is basically just JMX, though we have a few wrappers that expose additional things. We measure basic stuff like disk stats, messages/sec, latency, etc. In addition, we do a very Kafka-specific kind of monitoring we call "audit". This counts the number of messages sent by every producer, received by every broker, and received by every consumer, then reconciles, graphs, and alerts on these counts. This is very helpful in determining that all the sent data arrived at its destination. There is a bug open to open source this piece, though it has a few dependencies: https://issues.apache.org/jira/browse/KAFKA-260

-Jay

On Fri, Jul 27, 2012 at 6:00 PM, Jonathan Creasy <jcre...@box.com> wrote:
> How do you guys monitor Kafka? Do any of you have Nagios checks that you
> use? What metrics do you find important?
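
To make the "audit" idea from the reply above a bit more concrete, here is a rough sketch of the reconciliation step in Python. It is not LinkedIn's implementation. The tier names ("producer", "broker", "consumer"), the (tier, topic, window, count) report layout, the time-window bucketing, and the `reconcile` function are all assumptions made up for illustration. The only part taken from the description is the idea itself: count messages at each stage and flag any stage whose total does not match what the producers sent.

```python
from collections import defaultdict


def reconcile(reports, tolerance=0):
    """Compare per-tier message counts and return the mismatches.

    reports: iterable of (tier, topic, window, count) tuples, where tier is
    one of "producer", "broker", "consumer" (hypothetical layout).
    Returns a list of (topic, window, tier, sent, received) tuples for every
    tier whose total differs from the producer total by more than tolerance.
    """
    # Sum the counts reported by every instance of each tier.
    totals = defaultdict(int)
    for tier, topic, window, count in reports:
        totals[(topic, window, tier)] += count

    discrepancies = []
    topic_windows = sorted({(topic, window) for topic, window, _ in totals})
    for topic, window in topic_windows:
        sent = totals.get((topic, window, "producer"), 0)
        for tier in ("broker", "consumer"):
            received = totals.get((topic, window, tier), 0)
            if abs(sent - received) > tolerance:
                discrepancies.append((topic, window, tier, sent, received))
    return discrepancies


# Two producers sent 150 messages in window 0; the consumers saw only 148.
reports = [
    ("producer", "clicks", 0, 100),
    ("producer", "clicks", 0, 50),
    ("broker", "clicks", 0, 150),
    ("consumer", "clicks", 0, 148),
]
print(reconcile(reports))  # [('clicks', 0, 'consumer', 150, 148)]
```

In a real setup the returned discrepancies would feed whatever graphing and alerting is already in place, and a nonzero `tolerance` would absorb small, expected differences such as messages still in flight when the window closes.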