[jira] [Updated] (CASSANDRA-4179) Add more general support for composites (to row key, column value)

Sylvain Lebresne (JIRA) Thu, 05 Jul 2012 01:18:40 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne updated CASSANDRA-4179:
----------------------------------------

    Attachment: 4179.txt

Attaching a patch for row keys only, using the tuple syntax (i.e. the first 
syntax in the description above). Relating to my 2 points above:
* the number of components of the row key cannot be extended.
* thrift is not modified and, if the key is composite, i.e. if we have more than 
one key alias, then we don't return any key alias to thrift.
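
Since thrift gets no key alias for such a table, a thrift client would presumably have to build the composite key bytes itself. A minimal sketch in Python, assuming the row key uses the same CompositeType layout as composite column names (2-byte big-endian length, the component bytes, then a one-byte end-of-component marker). That layout is an assumption here, not something stated in this comment, and pack_composite is a hypothetical helper name:

```python
import struct


def pack_composite(*components: bytes) -> bytes:
    """Pack already-serialized components into one composite value.

    Assumed layout per component (CompositeType-style):
    2-byte big-endian length, raw bytes, 1-byte end-of-component (0).
    """
    out = bytearray()
    for comp in components:
        if len(comp) > 0xFFFF:
            raise ValueError("component longer than 65535 bytes")
        out += struct.pack(">H", len(comp))  # length prefix
        out += comp                          # component value
        out += b"\x00"                       # end-of-component marker
    return bytes(out)


# Row key for name = 'foo', month = 1
# (text as UTF-8, int as a 4-byte big-endian integer).
key = pack_composite("foo".encode("utf-8"), struct.pack(">i", 1))
```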

Other than that, the patch has a few limitations/missing parts:
* Taking the example in the description, you can do
{noformat}
SELECT * FROM timeline WHERE name = 'foo' AND month IN (1, 4, 8)
{noformat}
but there is no way to do an 'IN' on the full row key, mostly because we're 
lacking the syntax to express it. I.e. we would need something like:
{noformat}
SELECT * FROM timeline WHERE (name, month) IN (('foo', 1), ('bar', 4))
{noformat}
Now that is probably not a blocker per se for the patch to be committed, but I 
think we should at least figure out a syntax, because that kind of 'IN' query 
can be convenient when you do manual indexing, so imo it is worth supporting 
(I know there is the old argument that IN queries are not really useful and 
that client drivers should handle the parallelism themselves instead, but I 
don't really agree, especially not for CQL3, where one of the goals is to 
remove burden from the client driver).
* For a very similar reason, the token() function from CASSANDRA-3771 is not 
supported on composite keys. A similar syntax for that could be
{noformat}
SELECT * FROM timeline WHERE token(name, month) > token('bar', 4)
SELECT * FROM timeline WHERE token(name, month) > "19389135324729031"
{noformat}
Opinions?
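
Until a tuple form of 'IN' exists, the full-key lookup can be approximated on the client by grouping the (name, month) pairs by name and issuing one of the already supported "name = ... AND month IN (...)" queries per name. A minimal sketch in Python; expand_full_key_in and cql_text are hypothetical helper names, and the column names are taken from the timeline example:

```python
from collections import defaultdict


def cql_text(value: str) -> str:
    # CQL string literals escape a single quote by doubling it.
    return "'" + value.replace("'", "''") + "'"


def expand_full_key_in(table: str, keys):
    """Rewrite a (name, month) IN (...) lookup as one SELECT per name."""
    months_by_name = defaultdict(list)
    for name, month in keys:
        # Keep each month once per name, in first-seen order.
        if month not in months_by_name[name]:
            months_by_name[name].append(month)

    statements = []
    for name, months in months_by_name.items():
        month_list = ", ".join(str(int(m)) for m in months)
        statements.append(
            f"SELECT * FROM {table} WHERE name = {cql_text(name)} "
            f"AND month IN ({month_list})"
        )
    return statements


queries = expand_full_key_in("timeline", [("foo", 1), ("bar", 4), ("foo", 8)])
```

This still leaves the fan-out to the client, which is exactly the burden a native syntax would remove.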

                
> Add more general support for composites (to row key, column value)
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4179
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4179
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: API
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>         Attachments: 4179.txt
>
>
> Currently CQL3 has a nice syntax for using composites in the column name 
> (it's more than that in fact, it creates a whole new abstraction, but let's 
> say I'm talking implementation here). There are however 2 other places where 
> composites could be used (again implementation-wise): the row key and the 
> column value. This ticket proposes to explore which of those make sense for 
> CQL3 and how.
> For the row key, I really think that CQL support makes sense. It's very 
> common (and useful) to want to stuff composite information in a row key. 
> Sharding a time series (CASSANDRA-4176) is probably the best example, but 
> there are others.
> For the column value it is less clear. CQL3 makes it very transparent and 
> convenient to store multiple related values into multiple columns, so maybe 
> composites in a column value are much less needed. I do still see two cases 
> for which it could be handy:
> # to save some disk/memory space, if you know it makes no sense to 
> insert/read the two values separately.
> # if you want to enforce that two values should not be inserted separately, 
> i.e. to enforce a form of "constraint" to avoid programmatic errors.
> Those are not widely useful things, but my reasoning is that if whatever 
> syntax we come up with for "grouping" the row key in a composite trivially 
> extends to column values, why not support it.
> As for syntax I have 3 suggestions (that are just that, suggestions):
> # If we only care about allowing grouping for row keys:
> {noformat}
> CREATE TABLE timeline (
>     name text,
>     month int,
>     ts timestamp,
>     value text,
>     PRIMARY KEY ((name, month), ts)
> )
> {noformat}
> # A syntax that could work for both grouping in row key and column value:
> {noformat}
> CREATE TABLE timeline (
>     name text,
>     month int,
>     ts timestamp,
>     value1 text,
>     value2 text,
>     GROUP (name, month) as key,
>     GROUP (value1, value2),
>     PRIMARY KEY (key, ts)
> )
> {noformat}
> # An alternative to the preceding one:
> {noformat}
> CREATE TABLE timeline (
>     name text,
>     month int,
>     ts timestamp,
>     value1 text,
>     value2 text,
>     PRIMARY KEY (key, ts)
> ) WITH GROUP (name, month) AS key
>    AND GROUP (value1, value2)
> {noformat}

