> > How would it be different from creating an actual real extra table instead?
> There's nothing that warrants making the codebase more complex to
> accomplish something it already does.

As far as I was aware, the only point of static columns was to support the thrift ability to mutate and read them in the same expression, with atomicity and isolation.

As to whether or not it is more complex, I'm not at all convinced that it would be. We have had a lot of unexpected special casing added to ensure they behave correctly (e.g. paging is broken), and have complicated the comparison/slice logic to accommodate them, so that it is harder to reason about (and to optimise). They also have very different compaction characteristics, so the complexity on the user is increased without their necessarily realising it. All told, it introduces a lot more subtlety of behaviour than there would be with a separate set of sstables, or perhaps a separate file attached to each sstable.

Of course, we've already implemented it as a specialisation of the slice/comparator, I think because it seemed like the least frictional path to do so, but that doesn't mean it is the least complex. It does mean it's the least work (assuming we're now on top of the bugs), which is its own virtue.

There are some advantages to having them managed separately, and advantages to having them combined. Combined, for small partitions, they can be read in the same seek. However for large partitions this is no longer true, and we may behave much worse by polluting the page cache with lots of unwanted data that is adjacent to the static columns. If they were managed separately, the page cache would be populated mostly with other static columns, which may be more likely of use. We could quite easily have a "static column" cache, also, and completely avoid merging them. Or at least we could easily read them with collectTimeOrderedData instead of collectAllData semantics.

All told, it certainly isn't a terrible idea, and shouldn't be dismissed so readily.
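
To put rough numbers on the page cache point, here is a toy sketch rather than a measurement. Every constant in it (page size, static block size, partition size, partition count) is made up for illustration, and it ignores readahead, compression and index lookups. It only counts how much of each page pulled in by a static read is actually static data under the two layouts.

```python
PAGE = 4096  # bytes per page-cache page (assumed)


def pages_touched(offset, length):
    """Set of page indices covered by reading `length` bytes at `offset`."""
    return set(range(offset // PAGE, (offset + length - 1) // PAGE + 1))


def useful_fraction(static_offsets, static_size):
    """Share of cached bytes that are static data after reading every static block."""
    pages = set()
    for off in static_offsets:
        pages |= pages_touched(off, static_size)
    return len(static_offsets) * static_size / (len(pages) * PAGE)


STATIC_SIZE = 128          # bytes of static data per partition (made up)
PARTITION_SIZE = 1 << 20   # 1 MiB of clustered rows per partition (made up)
PARTITIONS = 100

# Combined layout: each static block sits at the head of its large partition.
combined = [p * (STATIC_SIZE + PARTITION_SIZE) for p in range(PARTITIONS)]
# Separate layout: static blocks are packed back to back in their own file.
separate = [p * STATIC_SIZE for p in range(PARTITIONS)]

print(useful_fraction(combined, STATIC_SIZE))  # 0.03125
print(useful_fraction(separate, STATIC_SIZE))  # 0.78125
```

With these invented sizes, about 3% of the cached bytes are static data in the combined layout against about 78% in the separate one. For small partitions the gap shrinks, which is the "same seek" advantage of keeping them together.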

Personally I think in the long run whether or not we manage static columns together with non-static columns depends on whether we intend to add tiered "static" columns (i.e., if each level of clustering component can have columns associated with it). If we do, we should definitely keep it all inline. If not, it probably permits a lot better behaviour to separate them, since it's easier to reason about and improve their distinct characteristics.

On Fri, May 1, 2015 at 1:24 AM, graham sanderson <gra...@vast.com> wrote:

> Well you lose the atomicity and isolation, but in this case that is
> probably fine
>
> That said, in every interaction I’ve had with static columns, they seem to
> be an odd duck (e.g. adding or complicating range slices), perhaps worthy
> of their own code path and sstables. Just food for thought.
>
>
> > On Apr 30, 2015, at 7:13 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> >
> > If you want it in a separate sstable, just use a separate table. There's
> > nothing that warrants making the codebase more complex to accomplish
> > something it already does.
> >
> > On Thu, Apr 30, 2015 at 5:07 PM graham sanderson <gra...@vast.com> wrote:
> >
> >> Anyone here have an opinion; how realistic would it be to have a separate
> >> memtable/sstable for static columns?
> >>
> >> Begin forwarded message:
> >>
> >> *From: *Jonathan Haddad <j...@jonhaddad.com>
> >> *Subject: **Re: DateTieredCompactionStrategy and static columns*
> >> *Date: *April 30, 2015 at 3:55:46 PM CDT
> >> *To: *u...@cassandra.apache.org
> >> *Reply-To: *u...@cassandra.apache.org
> >>
> >>
> >> I suspect this will kill the benefit of DTCS, but haven't tested it to be
> >> 100% here.
> >>
> >> The benefit of DTCS is that sstables are selected for compaction based on
> >> the age of the data, not their size. When you mix TTL'ed data and non
> >> TTL'ed data, you end up screwing with the "drop the entire SSTable"
> >> optimization. I don't believe this is any different just because you're
> >> mixing in static columns. What I think will happen is you'll end up with
> >> an sstable that's almost entirely TTL'ed with a few static columns that
> >> will never get compacted or dropped. Pretty much the worst scenario I can
> >> think of.
> >>
> >>
> >> On Thu, Apr 30, 2015 at 11:21 AM graham sanderson <gra...@vast.com> wrote:
> >>
> >>> I have a potential use case I haven’t had a chance to prototype yet,
> >>> which would normally be a good candidate for DTCS (i.e. data delivered in
> >>> order and a fixed TTL), however with every write we’d also be updating some
> >>> static cells (namely a few key/values in a static map<text, text> CQL
> >>> column). There could also be explicit deletes of keys in the static map,
> >>> though that’s not 100% necessary.
> >>>
> >>> Since those columns don’t have TTL, without reading through the code
> >>> and/or trying it, I have no idea what effect this has on DTCS (perhaps it
> >>> needs to use separate sstables for static columns). Has anyone tried this?
> >>> If not I eventually will and will report back.
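
The "drop the entire SSTable" concern in the quoted DTCS message can be illustrated with a toy model. This is a sketch only, not Cassandra code: the `Cell` class and `fully_expired` function are hypothetical names invented for the example, and a real check has more conditions to satisfy (tombstone grace periods, overlap with other sstables), none of which are modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored cell; not a Cassandra type."""
    name: str
    write_time: int          # seconds on an arbitrary clock
    ttl: Optional[int]       # None means the cell never expires

    def expired(self, now: int) -> bool:
        return self.ttl is not None and now >= self.write_time + self.ttl


def fully_expired(sstable: List[Cell], now: int) -> bool:
    """True only if every cell has expired, so the whole file could be dropped."""
    return all(cell.expired(now) for cell in sstable)


# Time-series rows written with a fixed one-hour TTL.
ttl_rows = [Cell(f"event_{i}", write_time=i, ttl=3600) for i in range(1000)]
# A single static map entry written alongside them, with no TTL.
static_cell = Cell("static_map['k']", write_time=0, ttl=None)

ttl_only = ttl_rows
mixed = ttl_rows + [static_cell]

now = 10_000  # well past every TTL in this example
print(fully_expired(ttl_only, now))  # True
print(fully_expired(mixed, now))     # False
```

In this model one non-expiring static cell is enough to keep an otherwise fully expired file alive, which is the "almost entirely TTL'ed with a few static columns" scenario described in the quoted message.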