Don’t forget about deleted and missing data: the bane of all on-replica aggregation optimizations.
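One way to read that concern, as a toy Python sketch rather than anything taken from Cassandra's code: a max that is cached as partition metadata goes stale when the row that produced it is deleted, and two replicas disagree when one of them missed a write. The class `ReplicaPartition` and its `cached_max` field are hypothetical names used only for illustration, and the sketch removes rows directly instead of modeling tombstones.

```python
class ReplicaPartition:
    """Toy model of one replica's copy of a partition (hypothetical, not Cassandra code)."""

    def __init__(self):
        self.rows = {}          # row key -> write timestamp
        self.cached_max = None  # hypothetical partition-level metadata

    def upsert(self, key, ts):
        self.rows[key] = ts
        # The cached max is cheap to maintain on writes...
        if self.cached_max is None or ts > self.cached_max:
            self.cached_max = ts

    def delete(self, key):
        # ...but a max cannot be "un-applied": fixing it needs a rescan.
        self.rows.pop(key, None)

    def scanned_max(self):
        # Ground truth obtained by scanning the live rows.
        return max(self.rows.values(), default=None)


replica_a, replica_b = ReplicaPartition(), ReplicaPartition()
replica_a.upsert("r1", 100)
replica_b.upsert("r1", 100)
replica_a.upsert("r2", 200)   # missing data: replica_b never sees this write
replica_a.delete("r2")        # deleted data: replica_a's cached max is now stale

print(replica_a.cached_max, replica_a.scanned_max(), replica_b.cached_max)
```

After the delete, the cached value on the first replica no longer matches a scan of its live rows, and it also differs from the second replica's cached value.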
> On Jan 14, 2018, at 12:07 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> You’re right, it’s not stored in metadata now. Adding this to metadata isn’t
> hard; it’s just hard to do it right, in a way that’s useful to people with
> other data models (besides yours), so it can make it upstream (if that’s your
> goal). In particular, the worst possible case is a table with no clustering
> key and a single non-partition-key column. In that case, storing these two
> extra long timestamps may take 2-3x more storage than without, which would be
> a huge regression, so you’d have to have a way to turn that feature off.
>
> Worth mentioning that there are ways to do this without altering Cassandra:
> consider using static columns that represent the min timestamp and max
> timestamp. Create them both as ints or longs and write them on all
> inserts/updates (as part of a batch, if needed). The only thing you’ll have
> to do is find a way for “min timestamp” to work: you can set the min
> timestamp column with an explicit “using timestamp” timestamp = 2^31-NOW, so
> that future writes won’t overwrite those values. That gives you
> first-write-wins behavior for that column, which gives you an effective min
> timestamp for the partition as a whole.
>
> --
> Jeff Jirsa
>
>> On Jan 13, 2018, at 4:58 AM, Arthur Kushka <arhel...@gmail.com> wrote:
>>
>> Hi folks,
>>
>> Currently, I am working on a custom CQL operator that should return the max
>> timestamp for some partition.
>>
>> I don't think that scanning the partition for that kind of data is a nice
>> idea. Instead, I am thinking about adding metadata to the partition. I
>> want to store minTimestamp and maxTimestamp for every partition, as is
>> already done in Memtables. Those timestamps will be updated on each
>> mutation operation, which is quite cheap in comparison to a full scan.
>>
>> I am quite new to the Cassandra codebase and would like some criticism and
>> ideas; maybe that kind of data is already stored somewhere, or you have
>> better ideas. Is my assumption right?
>>
>> Best,
>> Artur
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
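
The static-column workaround in Jeff's message relies on last-write-wins conflict resolution: writing the min column with an inverted write timestamp means earlier writes carry larger timestamps and therefore win. A minimal Python simulation of that idea follows. It does not talk to Cassandra; `LwwCell` and `PartitionBounds` are hypothetical names, and it assumes NOW is expressed in seconds so that 2^31-NOW stays positive.

```python
class LwwCell:
    """Toy last-write-wins cell: the write with the highest timestamp is kept."""

    def __init__(self):
        self.value = None
        self.write_ts = None

    def write(self, value, write_ts):
        if self.write_ts is None or write_ts > self.write_ts:
            self.value = value
            self.write_ts = write_ts


CEILING = 2**31  # constant quoted in the thread; assumes NOW is in seconds


class PartitionBounds:
    """Stand-in for two static columns holding a partition's min and max timestamps."""

    def __init__(self):
        self.min_ts = LwwCell()
        self.max_ts = LwwCell()

    def record_write(self, now_seconds):
        # Max column: ordinary write timestamp, so later writes overwrite earlier ones.
        self.max_ts.write(now_seconds, write_ts=now_seconds)
        # Min column: inverted write timestamp, so the earliest write has the
        # largest timestamp and later writes cannot overwrite it.
        self.min_ts.write(now_seconds, write_ts=CEILING - now_seconds)


bounds = PartitionBounds()
# The third write arrives out of order with an older clock value.
for now in (1515900000, 1515900060, 1515899000, 1515900120):
    bounds.record_write(now)

print(bounds.min_ts.value, bounds.max_ts.value)
```

In this sketch the min cell ends up holding the smallest NOW seen, even when that write arrives out of order, while the max cell holds the largest. In CQL the same effect would come from writing the min column with an explicit `USING TIMESTAMP` value, as described in the quoted message.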