On 2017-06-01 09:09 (-0700), "Harper, Paul" <[email protected]> wrote:
> Hello All,
>
> I'm about 3 months into support several clusters of Cassandra databases. I
> recently subscribed to this email list and I receive lots of interesting
> emails most of which I don't understand. I feel like I have a pretty good
> grasp on Cassandra, I would like to know what types of this should I be
> checking on a daily, weekly or monthly basis. Many of the email I see in this
> string are on subjects I've never had to look at so far. So I'm wondering
> what is it that I should be monitoring or doing or I should know. I would
> appreciate it any advice or guidance you can provide. Please to my email and
> not the group listing unless it's something that maybe helpful to others.
>

The good news is that Cassandra can run for years without any intervention, especially if you're not pushing the limits.

At a high level, you should be watching:

- Reads/writes per second. Your application may warn you if these change, but catching a change before it impacts your application is always nice.
- Latencies: how long each read/write takes, and whether that is getting worse over time, which may indicate a problem brewing.
- How much data is on each node (hopefully it's pretty even).
- How many sstables are on each node (hopefully it's pretty even).
- GC pause times. You're probably using ParNew/CMS, and most metrics packages will know how to graph those as two distinct lines. Seeing long pauses is a good hint that things are starting to get bad.
- How often you are running repair, and whether it is succeeding or failing. If you delete data, you need to repair (successfully, all nodes) at least once every gc_grace_seconds (by default 10 days).
- Whether or not schema versions match. If schema diverges, you could have a big problem brewing.
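
The repair, balance, and schema items in the list above boil down to a few simple comparisons once you have the numbers in hand. Here is a rough Python sketch of those comparisons, not a real monitoring tool. Everything in it is an assumption for illustration: the function names, the 25% tolerance, and the idea that you have already collected per-node values yourself (for example from `nodetool status`, `nodetool describecluster`, and whatever log or scheduler records your repair runs). It does not talk to Cassandra at all.

```python
from datetime import datetime, timedelta, timezone

# Default gc_grace_seconds is 10 days; adjust if your tables override it.
GC_GRACE_SECONDS = 864000


def overdue_repairs(last_success, now, gc_grace_seconds=GC_GRACE_SECONDS):
    """Return nodes whose last successful repair is older than gc_grace_seconds.

    last_success maps node name -> datetime of last successful repair,
    or None if no successful repair is known (hypothetical input format).
    """
    window = timedelta(seconds=gc_grace_seconds)
    return sorted(
        node
        for node, finished in last_success.items()
        if finished is None or now - finished > window
    )


def uneven_nodes(per_node, tolerance=0.25):
    """Return nodes whose value differs from the cluster mean by more than
    `tolerance` (a fraction). Works for data size or sstable counts.
    The 25% default is an arbitrary illustrative threshold."""
    mean = sum(per_node.values()) / len(per_node)
    if mean == 0:
        return []
    return sorted(
        node
        for node, value in per_node.items()
        if abs(value - mean) / mean > tolerance
    )


def schema_agrees(versions):
    """True if every node reports the same schema version string."""
    return len(set(versions.values())) == 1


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    now = datetime(2017, 6, 1, tzinfo=timezone.utc)
    repairs = {
        "node1": now - timedelta(days=3),
        "node2": now - timedelta(days=12),
        "node3": None,
    }
    print("repair overdue:", overdue_repairs(repairs, now))
    print("uneven load:", uneven_nodes(
        {"node1": 100, "node2": 100, "node3": 100, "node4": 180}))
    print("schema agrees:", schema_agrees({"node1": "v1", "node2": "v1"}))
```

The point of keeping these as plain functions over dictionaries is that you can feed them from whatever metrics system you already have, and tune the thresholds to what is normal for your clusters.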
