The bloom filter hashes the values into a small number of buckets. I have
been surprised by how often I see high-cardinality cases where a few
values populate a given bloom leaf, resulting in a high false-positive
rate and a surprising impact on latencies!
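
For reference, here is a rough sketch of what a given
bloom_filter_fp_chance implies on paper. It uses the textbook bloom
filter formulas and assumes ideal, uniform hashing; it is not
Cassandra's actual sizing code, and the function names are made up for
illustration. What a node measures can differ from this design target.

```python
import math


def expected_fp_rate(bits_per_key: float, num_hashes: int) -> float:
    """Textbook estimate p ~ (1 - e^(-k/b))^k for b bits per key and k hashes."""
    return (1.0 - math.exp(-num_hashes / bits_per_key)) ** num_hashes


def sizing_for_target(fp_chance: float) -> tuple[float, int]:
    """Bits per key and hash count the textbook formulas suggest for a target rate."""
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    num_hashes = max(1, round(bits_per_key * math.log(2)))
    return bits_per_key, num_hashes


# Compare the table's current setting (0.01) with a lower one (0.001).
for target in (0.01, 0.001):
    bits, hashes = sizing_for_target(target)
    rate = expected_fp_rate(bits, hashes)
    print(f"target={target}: ~{bits:.1f} bits/key, {hashes} hashes, "
          f"estimated rate={rate:.4f}")
```

The point is only that lowering the target costs a few extra bits per
key; it says nothing about keys that cluster into the same buckets.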

Are you seeing 2:1 ranges between mean and worst-case latencies (allowing
for GC times)?

Daemeon Reiydelle
On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <ty...@datastax.com> wrote:

> You can try slightly lowering the bloom_filter_fp_chance on your table.
>
> Otherwise, it's possible that you're repeatedly querying one or two
> partitions that always trigger a bloom filter false positive.  You could
> try manually tracing a few queries on this table (for non-existent
> partitions) to see if the bloom filter rejects them.
>
> Depending on your Cassandra version, your false positive ratio could be
> inaccurate: https://issues.apache.org/jira/browse/CASSANDRA-8525
>
> There are also a couple of recent improvements to bloom filters:
> * https://issues.apache.org/jira/browse/CASSANDRA-8413
> * https://issues.apache.org/jira/browse/CASSANDRA-9167
>
>
> On Thu, Feb 18, 2016 at 1:35 AM, Anishek Agarwal <anis...@gmail.com>
> wrote:
>
>> Hello,
>>
>> We have a table with a composite partition key of very high cardinality;
>> it is a combination of (long, long). On the table we have
>> bloom_filter_fp_chance=0.010000.
>>
>> Running "nodetool cfstats" on the 5 nodes in the cluster, we are
>> seeing "Bloom filter false ratio:" in the range of 0.7-0.9.
>>
>> I thought the bloom filter would adjust to the key space cardinality
>> over time. We have been running the cluster for a long time, but since
>> January this year we have added significant traffic that does not lead
>> to writes in the db but does lead to heavy reads to check whether any
>> values exist.
>>
>> Are there any settings that can be changed to get a better ratio?
>>
>> Thanks
>> Anishek
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
