The bloom filter places the values into a small number of buckets. I have been surprised by how many cases I see with high cardinality where only a few values populate a given bloom leaf, resulting in a high false-positive rate and a surprising impact on latencies!
Are you seeing 2:1 ranges between mean and worst-case latencies (allowing for GC times)?

Daemeon Reiydelle

On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <ty...@datastax.com> wrote:

> You can try slightly lowering the bloom_filter_fp_chance on your table.
>
> Otherwise, it's possible that you're repeatedly querying one or two
> partitions that always trigger a bloom filter false positive. You could
> try manually tracing a few queries on this table (for non-existent
> partitions) to see if the bloom filter rejects them.
>
> Depending on your Cassandra version, your false positive ratio could be
> inaccurate: https://issues.apache.org/jira/browse/CASSANDRA-8525
>
> There are also a couple of recent improvements to bloom filters:
> * https://issues.apache.org/jira/browse/CASSANDRA-8413
> * https://issues.apache.org/jira/browse/CASSANDRA-9167
>
> On Thu, Feb 18, 2016 at 1:35 AM, Anishek Agarwal <anis...@gmail.com>
> wrote:
>
>> Hello,
>>
>> We have a table with a composite partition key with humungous cardinality;
>> it's a combination of (long,long). On the table we have
>> bloom_filter_fp_chance=0.010000.
>>
>> On doing "nodetool cfstats" on the 5 nodes we have in the cluster we are
>> seeing "Bloom filter false ratio:" in the range of 0.7-0.9.
>>
>> I thought over time the bloom filter would adjust to the key space
>> cardinality. We have been running the cluster for a long time now but have
>> added significant traffic from Jan this year, which would not lead to
>> writes in the db but would lead to high reads to see if there are any values.
>>
>> Are there any settings that can be changed to allow a better ratio?
>>
>> Thanks
>> Anishek
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
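
For a rough sense of what lowering bloom_filter_fp_chance costs, here is a minimal sketch using the textbook sizing formulas for an ideal Bloom filter. It is not Cassandra's actual implementation (which may round bucket and hash counts differently), and the function name `bloom_sizing` is made up for illustration.

```python
import math

def bloom_sizing(fp_chance):
    """Return (bits_per_key, num_hashes) for an ideal Bloom filter.

    Textbook formulas, assumed here for illustration only:
      bits per key = -ln(p) / (ln 2)^2
      hash count   = bits per key * ln 2
    """
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    num_hashes = bits_per_key * math.log(2)
    return bits_per_key, num_hashes

# Compare the table's current setting (0.01) with a lower one (0.001).
for p in (0.01, 0.001):
    bits, hashes = bloom_sizing(p)
    print(f"fp_chance={p}: ~{bits:.1f} bits/key, ~{hashes:.1f} hash functions")
```

Under these assumptions, going from 0.01 to 0.001 raises the filter from roughly 9.6 to roughly 14.4 bits per key, so the memory cost grows with the number of partitions per SSTable.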