Hi Jeff,

Thanks a lot for all these details; they are really helpful. My understanding is that the number of windows is a tradeoff between the amount of expired data waiting to be dropped and the number of sstables required to satisfy a read request.
In my case the data model does have a timestamp component. What is your recommendation for these cases?

* TTL = 21 days, typical read span <= 2 days
* TTL = 1300 days, typical read span 30 to 60 days

On Wed, 28 Sep 2022 at 16:22, Jeff Jirsa <[email protected]> wrote:

> So when I wrote TWCS, I wrote it for a use case that had 24h TTLs and 30
> days of retention. In that application, we had tested 12h windows, 24h
> windows, and 7 day windows, and eventually settled on 24h windows because
> that balanced factors like sstable size, sstables-per-read, and expired
> data waiting to be dropped (about 3%, 1/30th, on any given day). That's
> where that recommendation came from - it was mostly around how much expired
> data will sit around waiting to be dropped. That doesn't change with
> multiple data directories.
>
> If you go with fewer windows, you'll expire larger chunks at a time, which
> means you'll retain larger chunks waiting on expiration.
> If you go with more windows, you'll potentially touch more sstables on
> read.
>
> Realistically, if you can model your data to align with chunks (so each
> read only touches one window), the actual number of sstables shouldn't
> really matter much - the timestamps and bloom filter will avoid touching
> most of them on the read path anyway. If your data model doesn't have a
> timestamp component to it and you're touching lots of sstables on read,
> even 30 sstables is probably going to hurt you, and 210 would be really,
> really bad.
>
> On Wed, Sep 28, 2022 at 7:00 AM Grzegorz Pietrusza <[email protected]>
> wrote:
>
>> Hi All!
>>
>> According to TWCS documentation (
>> https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html)
>> the operator should choose compaction window parameters to select a
>> compaction_window_unit and compaction_window_size pair that produces
>> approximately 20-30 windows.
>>
>> I'm curious where this recommendation comes from? Also should the number
>> of windows be changed when more than one data directory is used? In my
>> example there are 7 data directories (partitions) and it seems that all of
>> them store 20-30 windows. Effectively this gives 140-210 sstables in total.
>> Is that an optimal configuration?
>>
>> Running on Cassandra 3.11
>>
>> Regards
>> Grzegorz
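
For anyone who wants to put rough numbers on the tradeoff discussed in this thread, here is a minimal back-of-the-envelope sketch in Python. It is not part of Cassandra or of TWCS itself. The function name `twcs_estimate` and the window sizes in the usage loop are hypothetical and chosen only for illustration. It assumes data arrives at a steady rate, that a window's data can only be dropped once its newest data has expired (so roughly one window's worth of expired data is waiting at any time), and that reads are not aligned to window boundaries.

```python
import math


def twcs_estimate(ttl_days, window_days, read_span_days):
    """Rough TWCS sizing estimate (hypothetical helper, not a Cassandra API).

    Returns a tuple of:
      - approximate number of windows kept on disk,
      - approximate fraction of data that is expired but still waiting
        to be dropped (about one window out of the TTL),
      - upper bound on the number of windows an unaligned read can touch.
    """
    windows = math.ceil(ttl_days / window_days)
    expired_fraction = window_days / ttl_days
    # A read spanning r days can straddle one extra window boundary
    # when it does not start exactly at the beginning of a window.
    windows_per_read = math.ceil(read_span_days / window_days) + 1
    return windows, expired_fraction, windows_per_read


if __name__ == "__main__":
    # Window sizes below are illustrative assumptions, not recommendations.
    cases = [
        ("30 day retention, 1 day windows", 30, 1, 1),
        ("TTL 21 days, 1 day windows", 21, 1, 2),
        ("TTL 1300 days, 60 day windows", 1300, 60, 60),
    ]
    for label, ttl, window, span in cases:
        windows, expired, per_read = twcs_estimate(ttl, window, span)
        print(f"{label}: {windows} windows, "
              f"{expired:.1%} expired data waiting, "
              f"up to {per_read} windows per read")
```

The first case corresponds to the 24h windows with 30 days of retention mentioned above, where about 1/30th of the data is waiting to be dropped on any given day. The estimate counts windows, not sstables: the number of sstables per window depends on compaction state and on how many data directories are in use.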
