I 100% agree with Benedict, but just to be clear about my use case:

1) We have state for, let's say, real estate listings
2) We get field level deltas for them
3) Previously we would store the base state and all the deltas in the
partition and roll them up from the beginning of time (this was a prototype,
and silly since there was no expiration strategy)
4) Preferred plan is to keep current state in a static map (i.e. one delta 
field only updates one cell) - we are MVCC but in the common case the latest 
version will be what we want
5) However we require history, so we’d use the partition to keep TTL deltas 
going backwards from the now state - this seems like a common pattern people 
would want. Note also that sometimes we might need to apply reverse deltas if 
C* is ahead of our SOLR indexes
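
To make points 4 and 5 concrete, here is a rough Python sketch (not our
actual code, and not CQL) of reading an older version by starting from the
current state and applying reverse deltas newest-first. The names state_at
and ReverseDelta, the integer timestamps, and the sample listing fields are
all made up for illustration; in Cassandra the current state would live in
the static map and each reverse delta would be a TTL'd clustering row.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical shape of one reverse delta: (timestamp, field, previous_value).
# previous_value is None when the field did not exist before the change.
ReverseDelta = Tuple[int, str, Optional[Any]]


def state_at(current: Dict[str, Any],
             reverse_deltas: List[ReverseDelta],
             as_of: int) -> Dict[str, Any]:
    """Rebuild the state as it was at `as_of` from the current state.

    Every change made after `as_of` is undone, newest first, by writing
    back the value the field had before that change.
    """
    state = dict(current)  # never mutate the caller's "now" state
    newest_first = sorted(reverse_deltas, key=lambda d: d[0], reverse=True)
    for ts, field, previous in newest_first:
        if ts <= as_of:
            break  # everything older than this is already reflected
        if previous is None:
            state.pop(field, None)  # field was introduced by this change
        else:
            state[field] = previous
    return state


# Illustrative data only: the "now" state plus three reverse deltas.
current = {"price": "450000", "status": "pending"}
history = [
    (100, "price", "475000"),   # at t=100 price changed away from 475000
    (200, "status", "active"),  # at t=200 status changed away from active
    (300, "price", "460000"),   # at t=300 price changed away from 460000
]

if __name__ == "__main__":
    print(state_at(current, history, as_of=250))
```

The common case (latest version) never touches the deltas at all, and
expired deltas simply limit how far back as_of can go.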

The static columns and the regular columns ARE completely different in 
behavior/lifecycle, so I’d definitely vote for them being treated as such.


> On May 1, 2015, at 7:27 AM, Benedict Elliott Smith 
> <belliottsm...@datastax.com> wrote:
> 
>> How would it be different from creating an actual real extra table instead?
> 
>> There's nothing that warrants making the codebase more complex to
>> accomplish something it already does.
> 
> 
> As far as I was aware, the only point of static columns was to support the
> thrift ability to mutate and read them in the same expression, with
> atomicity and isolation. As to whether or not it is more complex, I'm not
> at all convinced that it would be. We have had a lot of unexpected special
> casing added to ensure they behave correctly (e.g. paging is broken), and
> have complicated the comparison/slice logic to accommodate them, so that it
> is harder to reason about (and to optimise). They also have very different
> compaction characteristics, so the complexity on the user is increased
> without their necessarily realising it. All told, it introduces a lot more
> subtlety of behaviour than there would be with a separate set of sstables,
> or perhaps a separate file attached to each sstable.
> 
> Of course, we've already implemented it as a specialisation of the
> slice/comparator, I think because it seemed like the least frictional path
> to do so, but that doesn't mean it is the least complex. It does mean it's
> the least work (assuming we're now on top of the bugs), which is its own
> virtue.
> 
> There are some advantages to having them managed separately, and advantages
> to having them combined. Combined, for small partitions, they can be read
> in the same seek. However for large partitions this is no longer true, and
> we may behave much worse by polluting the page cache with lots of unwanted
> data that is adjacent to the static columns. If they were managed
> separately, the page cache would be populated mostly with other static
> columns, which may be more likely of use. We could quite easily have a
> "static column" cache, also, and completely avoid merging them. Or at least
> we could easily read them with collectTimeOrderedData instead of
> collectAllData semantics.
> 
> All told, it certainly isn't a terrible idea, and shouldn't be dismissed so
> readily. Personally I think in the long run whether or not we manage static
> columns together with non-static columns is dependent on if we intend to
> add tiered "static" columns (i.e., if each level of clustering component
> can have columns associated with it). If we do, we should definitely keep
> it all inline. If not, it probably permits a lot better behaviour to
> separate them, since it's easier to reason about and improve their distinct
> characteristics.
> 
> 
> On Fri, May 1, 2015 at 1:24 AM, graham sanderson <gra...@vast.com> wrote:
> 
>> Well you lose the atomicity and isolation, but in this case that is
>> probably fine
>> 
>> That said, in every interaction I’ve had with static columns, they seem to
>> be an odd duck (e.g. adding or complicating range slices), perhaps worthy
>> of their own code path and sstables. Just food for thought.
>> 
>>> On Apr 30, 2015, at 7:13 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
>>> 
>>> If you want it in a separate sstable, just use a separate table.  There's
>>> nothing that warrants making the codebase more complex to accomplish
>>> something it already does.
>>> 
>>> On Thu, Apr 30, 2015 at 5:07 PM graham sanderson <gra...@vast.com> wrote:
>>> 
>>>> Anyone here have an opinion; how realistic would it be to have a
>>>> separate memtable/sstable for static columns?
>>>> 
>>>> Begin forwarded message:
>>>> 
>>>> *From: *Jonathan Haddad <j...@jonhaddad.com>
>>>> *Subject: **Re: DateTieredCompactionStrategy and static columns*
>>>> *Date: *April 30, 2015 at 3:55:46 PM CDT
>>>> *To: *u...@cassandra.apache.org
>>>> *Reply-To: *u...@cassandra.apache.org
>>>> 
>>>> 
>>>> I suspect this will kill the benefit of DTCS, but haven't tested it to be
>>>> 100% sure here.
>>>> 
>>>> The benefit of DTCS is that sstables are selected for compaction based on
>>>> the age of the data, not their size.  When you mix TTL'ed data and non
>>>> TTL'ed data, you end up screwing with the "drop the entire SSTable"
>>>> optimization.  I don't believe this is any different just because you're
>>>> mixing in static columns.  What I think will happen is you'll end up with
>>>> an sstable that's almost entirely TTL'ed with a few static columns that
>>>> will never get compacted or dropped.  Pretty much the worst scenario I can
>>>> think of.
>>>> 
>>>> 
>>>> 
>>>> On Thu, Apr 30, 2015 at 11:21 AM graham sanderson <gra...@vast.com> wrote:
>>>> 
>>>>> I have a potential use case I haven’t had a chance to prototype yet,
>>>>> which would normally be a good candidate for DTCS (i.e. data delivered in
>>>>> order and a fixed TTL), however with every write we’d also be updating
>>>>> some static cells (namely a few key/values in a static map<text,text> CQL
>>>>> column). There could also be explicit deletes of keys in the static map,
>>>>> though that’s not 100% necessary.
>>>>> 
>>>>> Since those columns don’t have TTL, without reading through the code
>>>>> and/or trying it, I have no idea what effect this has on DTCS (perhaps it
>>>>> needs to use separate sstables for static columns). Has anyone tried this?
>>>>> If not I eventually will and will report back.
>>>> 
>>>> 
>> 
>> 
