On 2017-03-03 09:18 (-0800), Shravan Ch <chall...@outlook.com> wrote:
> nodetool compactionstats -H
> pending tasks: 3
>   compaction type   keyspace   table   completed   total      unit    progress
>        Compaction     system   hints     28.5 GB   92.38 GB   bytes   30.85%

The hint buildup could also have caused the OOMs. Hints for a given host are stored in a single partition, so it's common for one row/partition to get huge if a single host is flapping. If you see "Compacting large row" messages for the hint rows, I suspect you'll find that one host/row is responsible for most of that 92 GB of hints. When you try to deliver those hints, you read from a huge partition, which creates memory pressure (see CASSANDRA-9754) and leads to GC pauses (or OOMs). That causes the node to flap, which creates more hints, and you end up in an ugly spiral.

In 3.0, hints were rewritten to avoid this problem. Short term, you may need to truncate your hints to get healthy, assuming it's safe for you to do so, where 'safe' depends on your read and write consistency levels.
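
One way to read 'safe' here, as a rough sketch rather than a rule: if the replicas you read from always overlap the replicas you wrote to (R + W > RF), reads don't depend on hints having been delivered. The Python below just does that arithmetic. It assumes a single datacenter and only the simple consistency levels; the function names are made up for illustration and are not part of Cassandra or any driver.

```python
def replicas_needed(level: str, rf: int) -> int:
    """Replicas that must respond at a consistency level (single-DC assumption)."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    level = level.upper()
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def reads_overlap_writes(read_cl: str, write_cl: str, rf: int) -> bool:
    """True when R + W > RF, i.e. every read touches at least one replica
    that acknowledged the write, so reads do not rely on hint delivery."""
    return replicas_needed(read_cl, rf) + replicas_needed(write_cl, rf) > rf


if __name__ == "__main__":
    rf = 3  # hypothetical replication factor, not taken from the thread
    for read_cl, write_cl in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "ALL")]:
        overlap = reads_overlap_writes(read_cl, write_cl, rf)
        print(f"read={read_cl} write={write_cl} rf={rf} overlap={overlap}")
```

If the check comes back False (for example ONE/ONE with RF=3), dropping hints can leave stale replicas visible to reads until a repair runs, so plan a repair after truncating. The truncation itself can be done per node with `nodetool truncatehints`.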