@Aaron: Small side question: when do columns with an expired TTL actually get removed? On a repair, a (minor) compaction, or ...? Is there a performance drop while that is happening?
2012/1/2 aaron morton <[email protected]>

> Even if you had compaction enforcing a limit on the number of columns in a
> row, there would still be issues with concurrent writes at the same time
> and with read repair, i.e. node a says this is the first n columns but
> node b says something else; you only know who is correct at read time.
>
> Have you considered using a TTL on the columns?
>
> Depending on the use case, you could also consider having writes
> periodically or randomly trim the data size, or trimming on reads.
>
> It will also make sense to partition the time-series data into different
> rows, and Viva la Standard Column Families!
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:
>
> Hello Everybody,
>
> Happy Christmas.
>
> I know that this topic has come up quite a few times on the Dev and User
> lists but did not culminate in a solution.
>
> http://www.mail-archive.com/[email protected]/msg15367.html
>
> The above discussion on the User list talks about AbstractCompactionStrategy,
> but I could not find any relevant documentation, as it is a fairly new
> feature in Cassandra.
>
> Let me state the necessity and use case again.
>
> I need a ColumnFamily (CF)-wide or SuperColumn (SC)-wide option to
> approximately limit the number of columns to "n". "n" can vary a lot, and
> the intention is to throw away stale data, not to maintain any hard limit
> on the CF or SC. It is very useful for storing time-series data where
> stale data is not needed. The goal is to achieve this with minimum
> overhead, and since compaction happens all the time it would be clever to
> implement it as part of compaction.
>
> Thanks in advance.
>
> Praveen
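For what it's worth, Aaron's two suggestions (partition the series into different rows, and trim probabilistically on writes) can be sketched in application code. This is only a minimal illustration, not Cassandra API code: `bucket_row_key` and `should_trim` are hypothetical helper names, and the one-hour bucket size and 1% trim probability are assumed values you would tune for your workload.

```python
import random

def bucket_row_key(series_name, ts, bucket_seconds=3600):
    """Partition a time series into one row per time bucket, so stale
    buckets can simply stop being read (or be deleted as whole rows)."""
    bucket = int(ts) // bucket_seconds * bucket_seconds
    return "%s:%d" % (series_name, bucket)

def should_trim(probability=0.01):
    """Decide on each write whether to trim the row back to the newest n
    columns, amortizing the trim cost across roughly 1-in-100 writes."""
    return random.random() < probability

# Samples within the same hour share a row; the next hour starts a new row.
k1 = bucket_row_key("sensor-42", 1325419200)        # 2012-01-01 12:00:00 UTC
k2 = bucket_row_key("sensor-42", 1325419200 + 600)  # same hourly bucket as k1
k3 = bucket_row_key("sensor-42", 1325419200 + 3700) # lands in the next bucket
```

With keys like these, "throwing away stale data" becomes dropping (or just ignoring) old bucket rows, which sidesteps the per-column limit entirely.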
