FWIW, you can skip 2.2 and go straight from 2.1 to 3.11. I would wait for 3.11.4, though.

On Fri, Feb 1, 2019 at 12:53 PM Carl Mueller
<carl.muel...@smartthings.com.invalid> wrote:

> Interesting. Now that we have semiautomated upgrades, we are hopefully
> going to get everything to 3.11.x once we get the intermediate hop to
> 2.2.
>
> I'm thinking we could also use sstable metadata markings + custom
> compactors for things like multiple customers on the same table. So you
> could sequester the data for a customer in their own sstables, and then
> queries could effectively be subdivided against only the sstables that
> had that customer. Maybe the min and max would cover that; I'd have to
> look at the details.
>
> On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad <j...@jonhaddad.com>
> wrote:
>
> > In addition to what Jeff mentioned, there was an optimization in 3.4
> > that can significantly reduce the number of sstables accessed when a
> > LIMIT clause is used. This can be a pretty big win with TWCS.
> >
> > http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
> >
> > On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa <jji...@gmail.com> wrote:
> >
> > > In my original TWCS talk a few years back, I suggested that people
> > > make the partitions match the time window to avoid exactly what
> > > you’re describing. I added that to the talk because my first team
> > > that used TWCS (the team for which I built TWCS) had a data model
> > > not unlike yours, and the read-every-sstable thing turns out not to
> > > work that well if you have lots of windows (or very large
> > > partitions). If you do this, you can fan out a bunch of async reads
> > > for the first few days and ask for more as you need to fill the
> > > page. This means the reads are more distributed, too, which is an
> > > extra bonus when you have noisy partitions.
> > >
> > > In 3.0 and newer (I think, don’t quote me on the specific version),
> > > the sstable metadata has the min and max clustering, which helps
> > > exclude sstables from the read path quite well if everything in the
> > > table is using timestamp clustering columns. I know there was some
> > > issue with this and RTs recently, so I’m not sure of its current
> > > state, but it is worth considering that this may be much better on
> > > 3.0+.
> > >
> > > --
> > > Jeff Jirsa
> > >
> > > > On Jan 31, 2019, at 1:56 PM, Carl Mueller
> > > > <carl.muel...@smartthings.com.invalid> wrote:
> > > >
> > > > Situation:
> > > >
> > > > We use TWCS for a task history table (partition is user, column
> > > > key is timeuuid of task; TWCS is used due to tombstone TTLs that
> > > > rotate out the tasks every month or so).
> > > >
> > > > However, we want to get a "slice" of tasks (say, tasks in the
> > > > last two days, where we are using TWCS sstable blocks of 12
> > > > hours).
> > > >
> > > > The problem is, this is a frequent user and they have tasks in
> > > > ALL the sstables that are organized by TWCS into time-bucketed
> > > > sstables.
> > > >
> > > > So Cassandra has to first read in, say, 80 sstables to
> > > > reconstruct the row, and only THEN can it exclude/slice on the
> > > > column key.
> > > >
> > > > Question:
> > > >
> > > > Am I wrong that the read path needs to grab all relevant sstables
> > > > before applying column key slicing, and is this already possible?
> > > > Admittedly we are on 2.1 for this table (we are in the process of
> > > > upgrading now that we have an automated upgrading program that
> > > > seems to work pretty well).
> > > >
> > > > If my assumption is correct, then the compaction strategy knows,
> > > > as it writes the sstables, what it is bucketing them as (and
> > > > could encode that in sstable metadata?). If my assumption is
> > > > right that slicing needs the whole row reconstructed, and if we
> > > > had a perfect infinite monkey coding team that could generate
> > > > whatever we wanted within some feasibility, could we provide
> > > > special hooks to do sstable exclusion based on metadata, given
> > > > that we know the metadata will indicate exclusion/inclusion of
> > > > columns?
> > > >
> > > > Goal:
> > > >
> > > > The overall goal would be to support exclusion of sstables from a
> > > > read path, in case we had compaction strategies hand-tailored for
> > > > other queries. Essentially we would be doing a first-pass
> > > > bucketsort exclusion with the sstable metadata marking the
> > > > buckets. This might aid support of superwide rows and paging
> > > > through column keys if we allowed the table creator to specify
> > > > bucketing as flushing occurs. In general it appears query
> > > > performance quickly degrades based on the number of sstables
> > > > required for a lookup.
> > > >
> > > > I still don't know the code nearly well enough to do patches, but
> > > > it would seem, based on my looking at custom compaction
> > > > strategies and the basic read path, that this would be a useful
> > > > extension for advanced users.
> > > >
> > > > The fallback would be a set of tables to serve as buckets, and we
> > > > span the buckets with queries when one bucket runs out. The
> > > > tables rotate.
> >
> > --
> > Jon Haddad
> > http://www.rustyrazorblade.com
> > twitter: rustyrazorblade
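
Jeff's suggestion in the quoted thread (make the partitions match the time window, fan out reads for the most recent windows, and ask for more only as needed to fill the page) could be sketched roughly as below. This is only an illustration under assumptions, not anything from Cassandra or a driver: the 12-hour window is taken from Carl's example, `bucket_of` and `read_page` are made-up names, and `fetch_bucket(user, bucket)` is a hypothetical callable standing in for whatever query reads a single (user, bucket) partition.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=12)  # assumed to match the TWCS window size
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_of(ts):
    """Return the integer window index that a timestamp falls into."""
    return (ts - EPOCH) // WINDOW


def read_page(fetch_bucket, user, now, page_size, batch=4, max_buckets=60):
    """Collect up to page_size rows for a user, newest bucket first.

    fetch_bucket(user, bucket) is a hypothetical callable that reads one
    (user, bucket) partition and returns its rows newest first.
    """
    rows = []
    newest = bucket_of(now)
    with ThreadPoolExecutor(max_workers=batch) as pool:
        for start in range(0, max_buckets, batch):
            count = min(batch, max_buckets - start)
            buckets = [newest - i for i in range(start, start + count)]
            # Fan out one read per bucket; map() keeps newest-first order.
            for chunk in pool.map(lambda b: fetch_bucket(user, b), buckets):
                rows.extend(chunk)
            # Stop asking for older buckets once the page is full.
            if len(rows) >= page_size:
                break
    return rows[:page_size]
```

With a layout like this, the partition key would presumably become (user, bucket) rather than user alone, so a two-day slice with 12-hour windows touches about four small partitions instead of one partition spread across every sstable.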