We are using DTCS and have a 30-day window for them before they are cleaned up. I don't think we can do anything about table sizing with DTCS. Please do let me know if there are other ideas.
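
For a rough sense of scale, the sketch below applies the textbook Bloom filter formulas to the numbers in the cfstats output quoted further down. It is only an estimate under stated assumptions: it assumes an ideally sized filter, and it assumes that a read for a non-existent partition consults the filter of every SSTable independently. Cassandra's actual implementation may size and consult filters differently, and the function names are made up for illustration.

```python
import math


def bloom_filter_bytes(num_keys: int, fp_chance: float) -> float:
    """Textbook size (in bytes) of an optimally configured Bloom filter.

    Uses bits = -n * ln(p) / (ln 2)^2. This is the idealised formula,
    not necessarily what Cassandra allocates.
    """
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return num_keys * bits_per_key / 8


def prob_any_false_positive(sstable_count: int, fp_chance: float) -> float:
    """Chance that at least one SSTable filter wrongly says "maybe present".

    Assumption: every SSTable's filter is checked, and the checks are
    independent, each with false positive probability fp_chance.
    """
    return 1 - (1 - fp_chance) ** sstable_count


# Values taken from the cfstats output in this thread.
keys = 345_137_664        # Number of keys (estimate)
sstables = 1289           # SSTable count
reported = 493_777_336    # Bloom filter space used, bytes

for p in (0.01, 0.001):
    size = bloom_filter_bytes(keys, p)
    hit = prob_any_false_positive(sstables, p)
    print(f"fp_chance={p}: ideal filter ~{size / 1e6:.0f} MB "
          f"(reported {reported / 1e6:.0f} MB), "
          f"P(at least one false positive per read)={hit:.4f}")
```

Under these assumptions, the reported filter size is in the same range as the ideal size for 0.01, and with this many SSTables almost every read for a missing key would hit at least one false positive. Lowering bloom_filter_fp_chance trades more memory for fewer false positives, while reducing the SSTable count reduces the number of filters consulted per read.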

On Sat, Feb 20, 2016 at 12:51 AM, Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> To me, the following three look to be on the higher side:
>
> SSTable count: 1289
>
> To reduce the SSTable count, check whether you are compacting or not (if
> using STCS). Is it possible to change this to LCS?
>
> Number of keys (estimate): 345137664 (345M partition keys)
>
> I don't have any suggestion for reducing this unless you partition your
> data.
>
> Bloom filter space used, bytes: 493777336 (nearly 500 MB is huge)
>
> If the number of keys is reduced, then I believe this will automatically
> reduce the bloom filter size.
>
> Jaydeep
>
> On Thu, Feb 18, 2016 at 7:52 PM, Anishek Agarwal <anis...@gmail.com>
> wrote:
>
>> Hey all,
>>
>> @Jaydeep here is the cfstats output from one node.
>>
>> Read Count: 1721134722
>> Read Latency: 0.04268825050756254 ms.
>> Write Count: 56743880
>> Write Latency: 0.014650376727851532 ms.
>> Pending Tasks: 0
>> Table: user_stay_points
>> SSTable count: 1289
>> Space used (live), bytes: 122141272262
>> Space used (total), bytes: 224227850870
>> Off heap memory used (total), bytes: 653827528
>> SSTable Compression Ratio: 0.4959736121441446
>> Number of keys (estimate): 345137664
>> Memtable cell count: 339034
>> Memtable data size, bytes: 106558314
>> Memtable switch count: 3266
>> Local read count: 1721134803
>> Local read latency: 0.048 ms
>> Local write count: 56743898
>> Local write latency: 0.018 ms
>> Pending tasks: 0
>> Bloom filter false positives: 40664437
>> Bloom filter false ratio: 0.69058
>> Bloom filter space used, bytes: 493777336
>> Bloom filter off heap memory used, bytes: 493767024
>> Index summary off heap memory used, bytes: 91677192
>> Compression metadata off heap memory used, bytes: 68383312
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 1629722
>> Compacted partition mean bytes: 1773
>> Average live cells per slice (last five minutes): 0.0
>> Average tombstones per slice (last five minutes): 0.0
>>
>> @Tyler Hobbs
>>
>> We are using Cassandra 2.0.15, so
>> https://issues.apache.org/jira/browse/CASSANDRA-8525 shouldn't occur.
>> The other problems look like they will be fixed in 3.0. We will most
>> likely try to slot in an upgrade to a 3.x version towards the second
>> quarter of this year.
>>
>> @Daemeon
>>
>> Latencies seem to have higher ratios; the graph is attached.
>>
>> I am mostly looking at bloom filters because, the way we do reads, we
>> read data with non-existent partition keys and it seems to take long to
>> respond: 720 queries take 2 seconds, with all 720 queries not returning
>> anything. The 720 queries are done in sequences of 180 queries each,
>> with the 180 of them running in parallel.
>>
>> thanks
>> anishek
>>
>> On Fri, Feb 19, 2016 at 3:09 AM, Jaydeep Chovatia
>> <chovatia.jayd...@gmail.com> wrote:
>>
>>> How many partition keys exist for the table that shows this problem
>>> (or can you provide nodetool cfstats for that table)?
>>>
>>> On Thu, Feb 18, 2016 at 11:38 AM, daemeon reiydelle <daeme...@gmail.com>
>>> wrote:
>>>
>>>> The bloom filter buckets the values in a small number of buckets. I
>>>> have been surprised by how many cases I see with large cardinality
>>>> where a few values populate a given bloom leaf, resulting in high
>>>> false positives, and a surprising impact on latencies!
>>>>
>>>> Are you seeing 2:1 ranges between mean and worst-case latencies
>>>> (allowing for GC times)?
>>>>
>>>> Daemeon Reiydelle
>>>>
>>>> On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <ty...@datastax.com> wrote:
>>>>
>>>>> You can try slightly lowering the bloom_filter_fp_chance on your
>>>>> table.
>>>>>
>>>>> Otherwise, it's possible that you're repeatedly querying one or two
>>>>> partitions that always trigger a bloom filter false positive. You
>>>>> could try manually tracing a few queries on this table (for
>>>>> non-existent partitions) to see if the bloom filter rejects them.
>>>>>
>>>>> Depending on your Cassandra version, your false positive ratio could
>>>>> be inaccurate: https://issues.apache.org/jira/browse/CASSANDRA-8525
>>>>>
>>>>> There are also a couple of recent improvements to bloom filters:
>>>>> * https://issues.apache.org/jira/browse/CASSANDRA-8413
>>>>> * https://issues.apache.org/jira/browse/CASSANDRA-9167
>>>>>
>>>>> On Thu, Feb 18, 2016 at 1:35 AM, Anishek Agarwal <anis...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We have a table with a composite partition key of humongous
>>>>>> cardinality; it's a combination of (long, long). On the table we
>>>>>> have bloom_filter_fp_chance=0.010000.
>>>>>>
>>>>>> On running "nodetool cfstats" on the 5 nodes we have in the cluster,
>>>>>> we are seeing "Bloom filter false ratio:" in the range of 0.7-0.9.
>>>>>>
>>>>>> I thought that over time the bloom filter would adjust to the key
>>>>>> space cardinality. We have been running the cluster for a long time
>>>>>> now but have added significant traffic since January this year,
>>>>>> which does not lead to writes in the db but does lead to a high
>>>>>> number of reads to see if there are any values.
>>>>>>
>>>>>> Are there any settings that can be changed to get a better ratio?
>>>>>>
>>>>>> Thanks
>>>>>> Anishek
>>>>>
>>>>> --
>>>>> Tyler Hobbs
>>>>> DataStax <http://datastax.com/>