I have to say that I have no idea how to tune them. I discovered the existence of bloom filters a few months ago, and even after reading http://wiki.apache.org/cassandra/ArchitectureOverview#line-132 and http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html I am still not sure what the impacts (positive and negative) of tuning the bloom filters would be.
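If I understand the 1.1 docs correctly, bloom_filter_fp_chance is set per
column family. Here is a minimal, untested sketch of what I think the change
would look like ("MyKeyspace" and "MyCF" are placeholder names, and 0.01 is
only an example value, not a recommendation):

  # Untested sketch: raise bloom_filter_fp_chance on one column family.
  # A higher value should mean smaller filters (less heap used) at the
  # price of more false positives, i.e. extra disk reads.
  echo "use MyKeyspace; update column family MyCF with bloom_filter_fp_chance = 0.01;" \
    | cassandra-cli -h localhost -B

If I read the tuning page right, existing SSTables keep their old filters
until they are rewritten (compaction, scrub or upgradesstables), so the
change would not take effect immediately.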
From what I have read, I understand that with a bloom_filter_fp_chance > 0 I
introduce a chance of getting a false positive from an SSTable, eventually
inducing more latency when answering queries, but using less memory. Is that
right?

"What are your bloom filter settings on your CFs?"

They are at the default (0, which seems to mean fully enabled:
http://www.datastax.com/docs/1.1/configuration/storage_configuration#bloom-filter-fp-chance ).

Can't they grow indefinitely, or is there a threshold? Is there a way to
"explore" the heap to be sure that bloom filters are causing this intensive
use of memory inside the heap before tuning them? (See the sketch at the end
of this mail.)

From http://www.datastax.com/docs/1.1/operations/tuning#tuning-bloomfilters :
"For example, to run an analytics application that heavily scans a particular
column family, you would want to inhibit or disable the Bloom filter on the
column family by setting it high"

Why would I do that? Won't it slow down the display of the analytics?

Alain

2012/11/7 Bryan <br...@appssavvy.com>

> What are your bloom filter settings on your CFs? Maybe look here:
> http://www.datastax.com/docs/1.1/operations/tuning#tuning-bloomfilters
>
>
> On Nov 7, 2012, at 4:56 AM, Alain RODRIGUEZ wrote:
>
> Hi,
>
> We just had some issues in production that we finally solved by upgrading
> hardware and increasing the heap.
>
> Now we have 3 xLarge servers from AWS (15 GB RAM, 4 CPUs / 8 cores). We
> added them and then removed the old ones.
>
> With the full default configuration, the 0.75 threshold of the 4 GB heap
> was being reached continuously, so I had to increase the heap to 8 GB:
>
> Memtable  : 2 GB   (manually configured)
> Key cache : 0.1 GB (min(5% of heap (in MB), 100 MB))
> System    : 1 GB   (more or less, from the DataStax doc)
>
> It should use about 3 GB, but it actually uses between 4 and 6 GB.
>
> So here are my questions:
>
> How can we know how the heap is being used, and monitor it?
> Why is that much memory used in the heap of my new servers?
>
> All configurations not specified are the defaults from Cassandra 1.1.2.
>
> Here is what happened to us before, and why we changed our hardware. If
> you have any clue about what happened, we would be glad to learn, and
> maybe go back to our old hardware.
>
> -------------------------------- User experience
> ------------------------------------------------------------------------
>
> We had a Cassandra 1.1.2 2-node cluster with RF 2 and CL.ONE (R&W),
> running on 2 m1.Large AWS instances (7.5 GB RAM, 2 CPUs / 4 cores,
> dedicated to Cassandra only).
>
> cassandra.yaml was configured with the 1.1.2 default options, and in
> cassandra-env.sh I configured a 4 GB heap with a 200 MB "new size".
>
> This is the heap that was supposed to be used:
>
> Memtable  : 1.4 GB (1/3 of the heap)
> Key cache : 0.1 GB (min(5% of heap (in MB), 100 MB))
> System    : 1 GB   (more or less, from the DataStax doc)
>
> So we are around 2.5 GB max in theory, out of 3 GB usable (threshold of
> 0.75 of the heap before flushing memtables because of pressure).
>
> I thought that was OK regarding the DataStax documentation:
>
> "Regardless of how much RAM your hardware has, you should keep the JVM
> heap size constrained by the following formula and allow the operating
> system's file cache to do the rest:
>
> (memtable_total_space_in_mb) + 1GB + (cache_size_estimate)"
>
> After adding a third node and changing the RF from 2 to 3 (to allow using
> CL.QUORUM and still be able to restart a node whenever we want), things
> went really bad, even though I still don't get how any of these operations
> could possibly affect the heap needed.
>
> All 3 nodes reached the 0.75 heap threshold (I tried to increase it to
> 0.85, but that was reached as well), and it never came back down. So my
> cluster started flushing a lot, and the load increased because of
> unceasing compactions. This unexpected load produced latency that broke
> our service down for a while. Even with the service down, Cassandra was
> unable to recover.
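About "exploring" the heap and checking whether the bloom filters are really
what is filling it, here is a rough, untested sketch of what I would try
(<pid> is a placeholder for the Cassandra process id):

  # Heap used vs. capacity as Cassandra itself reports it:
  nodetool -h localhost info

  # Per-column-family statistics; the "Bloom Filter Space Used" lines show
  # how much memory the filters take for each CF:
  nodetool -h localhost cfstats

  # Histogram of live objects in the JVM heap; the largest entries hint at
  # what is actually occupying it (memtables, filters, caches, ...):
  jmap -histo:live <pid> | head -n 30

Pointing jconsole (or any JMX client) at the node gives the same heap
numbers over time, which should help with monitoring.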
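And for reference, a sketch of the heap settings described in the quoted
mail as they would appear in cassandra-env.sh, with the budget they imply
under the DataStax 1.1 formula (the values are the ones quoted above, not a
recommendation):

  # In cassandra-env.sh (these variables exist in the stock 1.1 file):
  MAX_HEAP_SIZE="4G"
  HEAP_NEWSIZE="200M"

  # Expected usage per the formula
  # (memtable_total_space_in_mb) + 1GB + (cache_size_estimate):
  #   memtables : 1/3 of the heap          ~ 1.4 GB
  #   key cache : min(5% of heap, 100 MB)  ~ 0.1 GB
  #   "system"  : roughly                  ~ 1.0 GB
  #   total                                ~ 2.5 GB, against 0.75 * 4 GB = 3 GB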