[ 
https://issues.apache.org/jira/browse/CASSANDRA-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898963#comment-13898963
 ] 

Sylvain Lebresne commented on CASSANDRA-6561:
---------------------------------------------

bq. ALTER TABLE ADD should support adding a static column, but doesn't

Right, I pushed an additional commit for that on the same branch as before 
(https://github.com/pcmanus/cassandra/commits/6561-3). I updated the dtest for 
that too.

bq. dropping a static column doesn't work fully (it won't be compacted away)

You might have to be more specific than that. As far as I can tell, nothing 
special needs to be done for static columns beyond making sure the column name 
gets added to droppedColumns, and that is the case. I confirmed that with a 
quick manual test too: unless sstable2json is lying to me, the dropped static 
column does get compacted away.
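
To make the mechanism concrete, here is a minimal Python sketch of the idea, not Cassandra's actual code. It assumes droppedColumns maps a column name to the time of the drop, and that compaction discards any cell of that column written at or before that time. The `Cell` type and the `compact` function are hypothetical names used only for illustration.

```python
from typing import Dict, List, NamedTuple


class Cell(NamedTuple):
    column: str     # CQL column name the cell belongs to
    value: str
    timestamp: int  # write timestamp of the cell


def compact(cells: List[Cell], dropped_columns: Dict[str, int]) -> List[Cell]:
    """Keep only cells that were not written to a column before it was dropped."""
    kept = []
    for cell in cells:
        drop_time = dropped_columns.get(cell.column)
        # Assumption: a cell written at or before the drop time is purged.
        if drop_time is not None and cell.timestamp <= drop_time:
            continue
        kept.append(cell)
    return kept


cells = [Cell("s", "shared", 10), Cell("v", "foo", 11)]
survivors = compact(cells, {"s": 20})  # "s" was dropped at time 20
```

In this model the filter only looks at the column name, so a static column such as `s` is purged the same way as a regular one.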

> Static columns in CQL3
> ----------------------
>
>                 Key: CASSANDRA-6561
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6561
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0.6
>
>
> I'd like to suggest the following idea for adding "static" columns to CQL3.  
> I'll note that the basic idea has been suggested by jhalliday on irc but the 
> rest of the details are mine and I should be blamed for anything stupid in 
> what follows.
> Let me start with a rationale: there are 2 main families of CF that have 
> historically been used in Thrift: static ones and dynamic ones. CQL3 handles 
> both families through the presence or absence of clustering columns. There 
> are, however, some cases where mixing both behaviors has its use. I like to 
> think of those use cases as 3 broad categories:
> # to denormalize small amounts of not-entirely-static data in otherwise 
> static entities. Say, "tags" for a product or "custom properties" in a 
> user profile. This is why we've added CQL3 collections. Importantly, this is 
> the *only* use case for which collections are meant (which doesn't diminish 
> their usefulness imo, and I wouldn't disagree that we've maybe not 
> communicated this too well).
> # to optimize fetching both a static entity and related dynamic ones. Say you 
> have blog posts, and each post has associated comments (chronologically 
> ordered). *And* say that a very common query is "fetch a post and its last 50 
> comments". In that case, it *might* be beneficial to store a blog post 
> (static entity) in the same underlying CF as its comments for performance 
> reasons, so that "fetch a post and its last 50 comments" is just one slice 
> internally.
> # you want to CAS rows of a dynamic partition based on some partition 
> condition. This is the same use case that CASSANDRA-5633 exists for.
> As said above, 1) is already covered by collections, but 2) and 3) are not 
> (and I strongly believe collections are not the right fit, API wise, for 
> those).
> Also, note that I don't want to overestimate the usefulness of 2). In most 
> cases, using a separate table for the blog posts and the comments is The 
> Right Solution, and trying to do 2) is premature optimisation. Yet, when used 
> properly, that kind of optimisation can make a difference, so I think having 
> a relatively native solution for it in CQL3 could make sense.
> Regarding 3), though CASSANDRA-5633 would provide one solution for it, I have 
> the feeling that static columns are actually a more natural approach (in 
> terms of API). That's arguably more of a personal opinion/feeling though.
> So, long story short, CQL3 lacks a way to mix "static" and "dynamic" rows in 
> the same partition of the same CQL3 table, and I think such a tool could 
> have its use.
> The proposal is thus to allow "static" columns. Static columns would only 
> make sense in a table with clustering columns (the "dynamic" ones). A static 
> column value would be static to the partition (all rows of the partition 
> would share the value for such a column). The syntax would just be:
> {noformat}
> CREATE TABLE t (
>   k text,
>   s text static,
>   i int,
>   v text,
>   PRIMARY KEY (k, i)
> )
> {noformat}
> then you'd get:
> {noformat}
> INSERT INTO t(k, s, i, v) VALUES ('k0', 'I''m shared',       0, 'bar');
> INSERT INTO t(k, s, i, v) VALUES ('k0', 'I''m still shared', 1, 'foo');
> SELECT * FROM t;
>  k |                  s | i |    v
> ------------------------------------
> k0 | "I'm still shared" | 0 | "bar"
> k0 | "I'm still shared" | 1 | "foo"
> {noformat}
> There would be a few semantic details to decide on regarding deletions, ttl, 
> etc. but let's see if we agree it's a good idea first before ironing those 
> out.
> One last point is the implementation. Though I do think this idea has merit, 
> it's definitely not useful enough to justify rewriting the storage engine 
> for it. But I think we can support this relatively easily (emphasis on 
> "relatively" :)), which is probably the main reason why I like the approach.
> Namely, internally, we can store static columns as cells whose clustering 
> column values are empty. So in terms of cells, the partition of my example 
> would look like:
> {noformat}
> "k0" : [
>   (:"s" -> "I'm still shared"), // the static column
>   (0:"" -> "")                  // row marker
>   (0:"v" -> "bar")
>   (1:"" -> "")                  // row marker
>   (1:"v" -> "foo")
> ]
> {noformat}
> Of course, using empty values for the clustering columns doesn't quite work 
> because it could conflict with the user using empty clustering columns. But 
> in the CompositeType encoding we have the end-of-component byte, which we 
> could reuse with a specific value (say 0xFF; currently we never set that byte 
> to anything other than -1, 0 and 1) to indicate it's a static column.
> With that, we'd need to update the CQL3 statements to support the new syntax 
> and rules, but that's probably not horribly hard.
> So anyway, this may or may not be a good idea, but I think it has enough meat 
> to warrant some consideration.
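
The cell layout proposed in the quoted description can be sketched in a few lines of Python. This is only an illustrative model, not the storage engine: a cell name is represented as a `(clustering, column)` tuple, a clustering value of `None` stands in for the proposed "static" marker, and `materialize_rows` is a hypothetical helper that rebuilds the CQL rows of table `t` from the cells of one partition.

```python
from typing import Dict, List, Optional, Tuple

# Cell name = (clustering value, column name). A clustering value of None
# stands in for the proposed "static" marker; column "" is the row marker.
CellName = Tuple[Optional[int], str]


def materialize_rows(key: str, cells: Dict[CellName, str]) -> List[dict]:
    """Turn the cells of one partition of table t into CQL-style rows."""
    # Static cells have no clustering value and apply to every row.
    static = {col: val for (ck, col), val in cells.items() if ck is None}
    rows: Dict[int, dict] = {}
    for (ck, col), val in cells.items():
        if ck is None:
            continue
        # "k" and "i" are the partition key and clustering column of table t.
        row = rows.setdefault(ck, {"k": key, "i": ck, **static})
        if col:  # skip the row marker, which carries no column value
            row[col] = val
    return [rows[ck] for ck in sorted(rows)]


partition = {
    (None, "s"): "I'm still shared",  # the static column
    (0, ""): "",                      # row marker
    (0, "v"): "bar",
    (1, ""): "",                      # row marker
    (1, "v"): "foo",
}
rows = materialize_rows("k0", partition)
```

Both rows come back with the same value for `s`, since the single static cell is copied into every row of the partition.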



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
