We are running Cassandra 2.2.3 in 2 data centers with 3 nodes in each. The 
replication factor per data center is 3. The Xmx setting on the Cassandra 
JVMs is 4GB.

We have a workload that generates lots of tombstones and Cassandra goes 
OOM in about 24 hours. We've adjusted the tombstone_failure_threshold down 
to 25000 but we never see the TombstoneOverwhelmingException before the 
nodes start going OOM.

The table operation that looks to be the culprit is a scan of partition 
keys (i.e. we are scanning across narrow rows, not scanning within a wide 
row). The heap dump shows we have a RangeSliceReply containing an ArrayList 
with 1,823,230 org.apache.cassandra.db.Row objects with a retained heap 
size of 441MiB. A look inside one of the Row objects shows an 
org.apache.cassandra.db.DeletionInfo object, so I assume that means the row 
has been tombstoned.
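
For a sense of scale, here is a rough back-of-the-envelope calculation from the figures above. It is only a sketch: it assumes the 4GB Xmx means 4096 MiB, and the last line assumes (hypothetically) that each Row would count as one tombstone against the threshold, which is exactly the part we are unsure about.

```python
# Figures taken from the heap dump and configuration described above.
ROW_COUNT = 1_823_230        # Row objects in the RangeSliceReply's ArrayList
RETAINED_MIB = 441           # retained heap size of those objects, in MiB
HEAP_MIB = 4 * 1024          # assumption: Xmx of 4GB means 4096 MiB
FAILURE_THRESHOLD = 25_000   # our tombstone_failure_threshold setting

# Average retained bytes per Row object.
bytes_per_row = RETAINED_MIB * 1024 * 1024 / ROW_COUNT

# Share of the JVM heap held by this single reply.
heap_fraction = RETAINED_MIB / HEAP_MIB

# Hypothetical: how far past the threshold we would be if every Row
# counted as one tombstone.
rows_per_threshold = ROW_COUNT / FAILURE_THRESHOLD

print(f"{bytes_per_row:.0f} bytes retained per Row")
print(f"{heap_fraction:.1%} of the heap held by one reply")
print(f"{rows_per_threshold:.0f}x the failure threshold")
```

That works out to roughly 254 bytes per Row, about 10.8% of the heap for a single reply, and about 73 times the configured threshold.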

If all of the roughly 1.8 million Row objects are tombstoned (and it is 
likely that most of them are), is there a reason that the 
TombstoneOverwhelmingException never gets thrown? 



Regards,

Rick (R.) Gunderson 
Software Engineer
IBM Commerce, B2B Development - GDHA


Phone: 1-250-220-1053 
E-mail: rgunder...@ca.ibm.com 

1803 Douglas St
Victoria, BC V8T 5C3 
Canada 

