We have Collectd (http://www.collectd.org/) monitoring Cassandra via its Java/JMX plugin. Collectd feeds data to a central Graphite/Carbon (http://graphite.wikidot.com/start) instance via https://github.com/indygreg/collectd-carbon .

With Graphite, you can use the web UI (or the HTTP API) to build and save graph definitions that sum, display, or otherwise combine related values across the whole cluster. You can also use Graphite's HTTP API to export raw data, which your monitoring infrastructure could then poll for alerting.
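To make the polling idea concrete, here is a minimal Python sketch, not something we run in production. It assumes Graphite's render endpoint is reachable at `/render` and returns JSON when asked with `format=json`. The base URL, the target expression and the threshold are placeholders you would swap for your own, and the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url, target, window="-5min"):
    """Build a Graphite render URL that asks for JSON instead of an image."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def latest_values(payload):
    """Map each series name to its most recent non-null value (or None)."""
    latest = {}
    for series in payload:
        # Each datapoint is a [value, timestamp] pair; value may be null.
        values = [v for v, _ts in series["datapoints"] if v is not None]
        latest[series["target"]] = values[-1] if values else None
    return latest


def breaches(latest, threshold):
    """Return only the series whose latest value exceeds the threshold."""
    return {
        name: value
        for name, value in latest.items()
        if value is not None and value > threshold
    }


def poll(base_url, target, threshold, window="-5min"):
    """Fetch a target from Graphite and return the series over threshold."""
    # base_url and target are assumptions; use your own Graphite host and
    # whatever metric path your Collectd/Carbon setup actually produces.
    with urlopen(render_url(base_url, target, window), timeout=10) as resp:
        payload = json.load(resp)
    return breaches(latest_values(payload), threshold)
```

An alerting check could call `poll()` on a schedule with, say, a target that sums pending tasks across all nodes, and raise an alert whenever the returned dict is non-empty.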
I have a script that parses a storage-conf.xml file into a Collectd config snippet, but I don't have it posted publicly at the moment. In lieu of that, here are some samples that work with Cassandra 0.6.12.

Add the following to a Collectd types file (one type per line):

    cassandra_pool active:GAUGE:0:U, pending:GAUGE:0:U, completed:COUNTER:0:U
    cassandra_stage active:GAUGE:0:2147483648, pending:GAUGE:0:2147483648, completed:COUNTER:0:U
    cassandra_cf_cache rcnt_hit_rate:GAUGE:0:1, size:GAUGE:0:2147483648, capacity:GAUGE:0:2147483648, hits:COUNTER:0:U, requests:COUNTER:0:U
    cassandra_cf_store pending_tasks:GAUGE:0:2147483648, min_row_size:GAUGE:0:U, max_row_size:GAUGE:0:U, mean_row_size:GAUGE:0:U, memtbl_col_cnt:GAUGE:0:U, memtbl_data_size:GAUGE:0:U, memtbl_switch_cnt:COUNTER:0:U, read_cnt:COUNTER:0:U, rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, write_cnt:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, disk_used_total:GAUGE:0:U, disk_used_live:GAUGE:0:U, ss_table_count:GAUGE:0:1000000, bloom_false_pos:COUNTER:0:U, bloom_rcnt_f_ratio:GAUGE:0:1, bloom_false_ratio:GAUGE:0:1
    cassandra_compaction_manager pending:GAUGE:0:U, bytes_in_progress:GAUGE:0:U, bytes_compacted:GAUGE:0:U
    cassandra_storage_proxy rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, read_operations:COUNTER:0:U, range_operations:COUNTER:0:U, tot_rg_latency:COUNTER:0:U, rcnt_rg_latency:GAUGE:0:2147483648, write_operations:COUNTER:0:U

(The weird names are due to a character length limitation in Collectd, which enforces the restrictions of RRD because it uses RRD out of the box.)
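Since these entries are long and easy to get wrong, a small sanity check can help. The Python sketch below is my own illustration, not part of Collectd. It assumes data sources are separated by commas and/or whitespace, as in the entries above, and that each one has the form name:TYPE:min:max. As I understand the GenericJMX plugin, a type with several data sources needs one Attribute per data source, in the same order, so the second helper (a hypothetical name) pairs them up and complains when the counts differ.

```python
def parse_types_line(line):
    """Split one types entry into (type_name, [(ds_name, ds_type, min, max), ...])."""
    # Treat commas like whitespace so both separator styles are accepted.
    fields = line.replace(",", " ").split()
    if not fields:
        raise ValueError("empty types line")
    name, sources = fields[0], []
    for field in fields[1:]:
        parts = field.split(":")
        if len(parts) != 4:
            raise ValueError(f"bad data source {field!r} in type {name!r}")
        sources.append(tuple(parts))
    return name, sources


def pair_attributes(types_line, attributes):
    """Pair each data source name with a JMX attribute, in order.

    Raises ValueError when the number of attributes does not match the
    number of data sources declared for the type.
    """
    name, sources = parse_types_line(types_line)
    if len(sources) != len(attributes):
        raise ValueError(
            f"type {name!r} has {len(sources)} data sources "
            f"but {len(attributes)} attributes were given"
        )
    return [(source[0], attr) for source, attr in zip(sources, attributes)]
```

For example, calling `pair_attributes()` with the `cassandra_stage` entry and the three attributes `ActiveCount`, `PendingTasks` and `CompletedTasks` pairs `active`, `pending` and `completed` with them in that order.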
And add the following to your Collectd config file:

    <Plugin "java">
      JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
      LoadPlugin "org.collectd.java.GenericJMX"

      <Plugin "GenericJMX">
        # this will read a stage mbean
        <MBean "cassandra-row-read-stage">
          ObjectName "org.apache.cassandra.concurrent:type=ROW-READ-STAGE"
          InstancePrefix "cassandra_row_read_stage"
          <Value>
            Type "cassandra_stage"
            Attribute "ActiveCount"
            Attribute "PendingTasks"
            Attribute "CompletedTasks"
          </Value>
        </MBean>

        # this will read a specific column family mbean
        <MBean "cassandra-cf-foo">
          ObjectName "org.apache.cassandra.db:columnfamily=Foo,keyspace=KeySpace,type=ColumnFamilyStores"
          InstancePrefix "cassandra_cf_foo"
          <Value>
            Type "cassandra_cf_store"
            Attribute "PendingTasks"
            Attribute "MinRowCompactedSize"
            Attribute "MaxRowCompactedSize"
            Attribute "MeanRowCompactedSize"
            Attribute "MemtableColumnsCount"
            Attribute "MemtableDataSize"
            Attribute "MemtableSwitchCount"
            Attribute "ReadCount"
            Attribute "RecentReadLatencyMicros"
            Attribute "TotalReadLatencyMicros"
            Attribute "WriteCount"
            Attribute "RecentWriteLatencyMicros"
            Attribute "TotalWriteLatencyMicros"
            Attribute "TotalDiskSpaceUsed"
            Attribute "LiveDiskSpaceUsed"
            Attribute "LiveSSTableCount"
            Attribute "BloomFilterFalsePositives"
            Attribute "RecentBloomFilterFalseRatio"
            Attribute "BloomFilterFalseRatio"
          </Value>
        </MBean>

        # this defines what to connect to and what to collect.
        # I /think/ you can define multiple connections to monitor many
        # Cassandra instances from one Collectd instance, but haven't tried this
        <Connection>
          Host "cassandra"
          Collect "cassandra-row-read-stage"
          Collect "cassandra-cf-foo"
          ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"
        </Connection>
      </Plugin>
    </Plugin>

Greg

> -----Original Message-----
> From: mcasandra [mailto:mohitanch...@gmail.com]
> Sent: Thursday, March 24, 2011 11:45 AM
> To: cassandra-u...@incubator.apache.org
> Subject: Central monitoring of Cassandra cluster
>
> Can someone share if they have centralized monitoring for all cassandra
> servers. With many nodes it becomes difficult to monitor them individually
> unless we can look at data in one place. I am looking at solutions where this
> can be done. Looking at Cacti currently but not sure how to integrate it with
> JMX.
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Central-monitoring-of-Cassandra-cluster-tp6205275p6205275.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.