To me, the following three look on the high side:

SSTable count: 1289
To reduce the SSTable count, check whether compaction is actually keeping up (if you are using STCS). Is it possible to change this to LCS?

Number of keys (estimate): 345137664 (345M partition keys)
I don't have any suggestion for reducing this unless you partition your data.

Bloom filter space used, bytes: 493777336 (about 494 MB, which is huge)
If the number of keys is reduced, I believe the bloom filter size will shrink automatically.

Jaydeep

On Thu, Feb 18, 2016 at 7:52 PM, Anishek Agarwal <anis...@gmail.com> wrote:

> Hey all,
>
> @Jaydeep here is the cfstats output from one node.
>
> Read Count: 1721134722
> Read Latency: 0.04268825050756254 ms.
> Write Count: 56743880
> Write Latency: 0.014650376727851532 ms.
> Pending Tasks: 0
> Table: user_stay_points
> SSTable count: 1289
> Space used (live), bytes: 122141272262
> Space used (total), bytes: 224227850870
> Off heap memory used (total), bytes: 653827528
> SSTable Compression Ratio: 0.4959736121441446
> Number of keys (estimate): 345137664
> Memtable cell count: 339034
> Memtable data size, bytes: 106558314
> Memtable switch count: 3266
> Local read count: 1721134803
> Local read latency: 0.048 ms
> Local write count: 56743898
> Local write latency: 0.018 ms
> Pending tasks: 0
> Bloom filter false positives: 40664437
> Bloom filter false ratio: 0.69058
> Bloom filter space used, bytes: 493777336
> Bloom filter off heap memory used, bytes: 493767024
> Index summary off heap memory used, bytes: 91677192
> Compression metadata off heap memory used, bytes: 68383312
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 1629722
> Compacted partition mean bytes: 1773
> Average live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
>
> @Tyler Hobbs
>
> We are using Cassandra 2.0.15, so
> https://issues.apache.org/jira/browse/CASSANDRA-8525 shouldn't occur.
> It looks like the other problems will be fixed in 3.0; we will most likely
> try to slot in an upgrade to a 3.x version towards the second quarter of
> this year.
>
> @Daemeon
>
> Latencies seem to have higher ratios; the graph is attached.
>
> I am mostly looking at bloom filters because of the way we do reads:
> we read data with non-existent partition keys, and it seems to take
> long to respond. For example, 720 queries take 2 seconds, with all 720
> queries returning nothing. The 720 queries are issued in sequential
> batches of 180, with the 180 queries in each batch running in parallel.
>
> thanks
> anishek
>
> On Fri, Feb 19, 2016 at 3:09 AM, Jaydeep Chovatia <
> chovatia.jayd...@gmail.com> wrote:
>
>> How many partition keys exist for the table that shows this problem (or
>> provide nodetool cfstats for that table)?
>>
>> On Thu, Feb 18, 2016 at 11:38 AM, daemeon reiydelle <daeme...@gmail.com>
>> wrote:
>>
>>> The bloom filter buckets the values in a small number of buckets. I have
>>> been surprised by how many cases I see with large cardinality where a few
>>> values populate a given bloom leaf, resulting in high false positives, and
>>> a surprising impact on latencies!
>>>
>>> Are you seeing 2:1 ranges between mean and worst-case latencies
>>> (allowing for gc times)?
>>>
>>> Daemeon Reiydelle
>>> On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <ty...@datastax.com> wrote:
>>>
>>>> You can try slightly lowering the bloom_filter_fp_chance on your table.
>>>>
>>>> Otherwise, it's possible that you're repeatedly querying one or two
>>>> partitions that always trigger a bloom filter false positive. You could
>>>> try manually tracing a few queries on this table (for non-existent
>>>> partitions) to see if the bloom filter rejects them.
>>>>
>>>> Depending on your Cassandra version, your false positive ratio could be
>>>> inaccurate: https://issues.apache.org/jira/browse/CASSANDRA-8525
>>>>
>>>> There are also a couple of recent improvements to bloom filters:
>>>> * https://issues.apache.org/jira/browse/CASSANDRA-8413
>>>> * https://issues.apache.org/jira/browse/CASSANDRA-9167
>>>>
>>>> On Thu, Feb 18, 2016 at 1:35 AM, Anishek Agarwal <anis...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We have a table with a composite partition key of humongous
>>>>> cardinality; it's a combination of (long, long). On the table we have
>>>>> bloom_filter_fp_chance=0.010000.
>>>>>
>>>>> On running "nodetool cfstats" on the 5 nodes we have in the cluster, we
>>>>> are seeing "Bloom filter false ratio:" in the range of 0.7 - 0.9.
>>>>>
>>>>> I thought that over time the bloom filter would adjust to the key space
>>>>> cardinality. We have been running the cluster for a long time now, but
>>>>> have added significant traffic since January this year, which does not
>>>>> lead to writes in the db but does lead to a high volume of reads that
>>>>> check whether any values exist.
>>>>>
>>>>> Are there any settings that can be changed to get a better ratio?
>>>>>
>>>>> Thanks
>>>>> Anishek
>>>>>
>>>>
>>>> --
>>>> Tyler Hobbs
>>>> DataStax <http://datastax.com/>
>>>>
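
As a back-of-envelope check on the numbers discussed in this thread, here is a minimal Python sketch. It is not Cassandra code: it uses the textbook bloom filter sizing formula, and it assumes (as a simplification) that a read for a non-existent key consults the bloom filter of every SSTable and that those filters err independently. The function names are made up for illustration; Cassandra's real sizing and read path may differ.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook optimal bloom filter size, in bits per key, for a target false-positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def estimated_filter_bytes(num_keys: int, fp_chance: float) -> float:
    """Rough total filter size in bytes for num_keys keys (an estimate, not Cassandra's exact sizing)."""
    return num_keys * bits_per_key(fp_chance) / 8


def chance_of_any_false_positive(fp_chance: float, sstable_count: int) -> float:
    """Chance that at least one of sstable_count independent filters wrongly says 'maybe present'.

    Assumption: every SSTable's filter is checked for the key and errors are independent.
    """
    return 1.0 - (1.0 - fp_chance) ** sstable_count


if __name__ == "__main__":
    # Values taken from the cfstats output quoted in this thread.
    keys = 345137664
    sstables = 1289
    fp = 0.01

    print("bits per key:", bits_per_key(fp))
    print("estimated filter bytes:", estimated_filter_bytes(keys, fp))
    print("chance of >=1 false positive per read:", chance_of_any_false_positive(fp, sstables))
```

Under these assumptions, a 1% per-filter false-positive chance spread over 1289 SSTables makes it very likely that a lookup for a missing key hits at least one false positive, which would be consistent with the advice above to bring the SSTable count down as well as tuning bloom_filter_fp_chance.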