Something I've been monitoring lately is heap memory, because hitting the JVM's maximum heap size can cause a lot of problems. What usually happens when the heap reaches the limit is that garbage collection runs and pegs one of the CPU cores at 100%. With enough volume, clients that are connecting and actively reading and writing then time out, since they appear to be blocked while this happens.
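One way to watch heap usage and garbage-collection time together is through the standard `java.lang.management` MXBeans. The sketch below is only an illustration: the class name `HeapMonitor`, the sample count, and the interval are made up for the example, and it samples the JVM it runs in. Reading the same values from a separate Cassandra process would need a remote JMX connection, which is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapMonitor {

    /** Fraction of the maximum heap in use, or -1 if the maximum is undefined. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no limit is defined
        }
        return (double) usedBytes / (double) maxBytes;
    }

    /** Cumulative milliseconds spent in GC across all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        int samples = 5;         // hypothetical: number of readings to take
        long intervalMs = 1000;  // hypothetical: pause between readings
        long lastGc = totalGcMillis();

        for (int i = 0; i < samples; i++) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            long gcNow = totalGcMillis();
            // Print heap fill level and GC time accumulated since the previous sample.
            System.out.printf("heap used=%d max=%d fraction=%.3f gcMillisSinceLast=%d%n",
                    heap.getUsed(), heap.getMax(),
                    usedFraction(heap.getUsed(), heap.getMax()),
                    gcNow - lastGc);
            lastGc = gcNow;
            Thread.sleep(intervalMs);
        }
    }
}
```

A heap fraction that stays near 1.0 while the GC time per interval keeps climbing would be consistent with the behavior described above.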
So far I've traced my issues to that, and I'm thinking about ways to resolve it. My current thinking is that the KeyCachedFraction and RowCache values in a ColumnFamily have a huge impact on this. If anything else contributes, I'd love to know.
For my needs, our rows are not wide, but I have millions of them and potentially billions one day. I'm running on two 8 GB machines, and I'm trying to figure out how to keep heap usage level rather than growing over time. Any suggestions or advice would be most helpful.
Currently I'm running an experiment with RowCache set to 0 and KeyCachedFraction at 0.3. I'm assuming KeyCachedFraction will cache up to 30% of all keys on a given node. If my assumption is right that those two parameters are the main contributors to heap growth, then it sounds like you really just have to experiment with them and monitor the results.
Suhail
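To make the 30% assumption concrete, here is a rough back-of-the-envelope sketch. It takes the post's own reading of KeyCachedFraction at face value (that fraction of all keys on the node ends up cached), and both the key count and the bytes-per-entry figure are hypothetical placeholders, not measured Cassandra numbers. The class and method names are invented for the example.

```java
public class KeyCacheEstimate {

    /** Number of keys cached if the fraction applies to every key on the node. */
    static long cachedKeys(long keysOnNode, double fraction) {
        return Math.round(keysOnNode * fraction);
    }

    /** Rough heap cost: cached keys times an assumed per-entry size in bytes. */
    static long estimatedBytes(long keysOnNode, double fraction, long bytesPerEntry) {
        return cachedKeys(keysOnNode, fraction) * bytesPerEntry;
    }

    public static void main(String[] args) {
        long keysOnNode = 500_000_000L; // hypothetical: half a billion rows on one node
        double fraction = 0.3;          // the KeyCachedFraction value from the experiment
        long bytesPerEntry = 100L;      // assumption: key plus position plus object overhead

        long bytes = estimatedBytes(keysOnNode, fraction, bytesPerEntry);
        System.out.println("cached keys:     " + cachedKeys(keysOnNode, fraction));
        System.out.println("estimated bytes: " + bytes);
        System.out.println("estimated GiB:   " + bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```

Under these assumptions the estimate grows linearly with the number of keys, so a fixed fraction that fits comfortably at millions of rows may not fit in an 8 GB machine's heap at billions.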