[jira] [Updated] (CASSANDRA-2474) CQL support for compound columns

Sylvain Lebresne (Updated) (JIRA) Tue, 20 Dec 2011 06:30:01 -0800

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne updated CASSANDRA-2474:
----------------------------------------

    Attachment: raw_composite.txt

bq. True, but none of the other proposals even come close to being as friendly 
as this one for typical cases

Playing devil's advocate, I would say that 'sucking much less' doesn't 
necessarily make it 'the right solution'.

Now, don't get me wrong, I like the TRANSPOSED idea for composite. But I think 
you made 2 proposals:
# a reasonably generic way to access CF with composite comparator in a CQL-ish 
way: the TRANSPOSED part.
# an attempt at some special handling for the case of composites where the last 
component takes only a known number of values: the SPARSE thing.

I do like the first part. Though I'd like to mention some remarks on the 
following comment:

bq. We're using TRANSPOSED AS similarly to how databases have used storage 
hints like CLUSTERED. It doesn't affect the relational model of the data, but 
it gives you different performance characteristics

While I understand what you mean, I don't think it's completely true. Because 
in the transposed case, the order of definition matters, which has a 
consequence on what you can do, both in terms of writes and reads. 
Consider the two definitions:
{noformat}
CREATE TABLE test1 (
    key text primary key,
    prop1 int,
    prop2 int,
    prop3 int
)
{noformat}
and
{noformat}
CREATE TABLE test2 (
    key int primary key,
    prop1 int,
    prop2 int,
    prop3 int
) TRANSPOSED AS (prop1, prop2)
{noformat}
Those two definitions don't only differ from a performance standpoint. 
Typically, you can do
{noformat}
UPDATE test1 SET prop2 = 42 WHERE key = 'someKey';
{noformat}
but you cannot do the same query on test2. Btw, for test2, you don't 
necessarily have to specify prop2 for every row, but you need at least prop1 
and prop3 each time. My point is that you do have to understand a bit of what 
is going on underneath to understand the limitations we will have to put on this.
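
To make the write limitation concrete, here is a minimal Python sketch. It is 
a toy model, not Cassandra code, and it assumes that with TRANSPOSED AS (prop1, 
prop2) each test2 cell is named by the composite (prop1[, prop2]) and holds 
prop3 as its value. The helper write_test2 is hypothetical and only serves the 
illustration.

```python
# Toy model of a test2 storage row, not Cassandra code.
# Assumption: the cell name is the composite (prop1[, prop2]) and the cell
# value is prop3, so a write must supply at least prop1 and prop3.

def write_test2(row, prop1=None, prop2=None, prop3=None):
    """Insert one cell into a hypothetical test2 storage row (a dict)."""
    if prop1 is None or prop3 is None:
        raise ValueError("test2 writes need at least prop1 and prop3")
    # prop2 is optional: without it the composite name is just a prefix.
    name = (prop1,) if prop2 is None else (prop1, prop2)
    row[name] = prop3
    return row

row = {}
write_test2(row, prop1=1, prop2=42, prop3=7)  # full composite name (1, 42)
write_test2(row, prop1=2, prop3=9)            # prop2 omitted, name is (2,)

try:
    # Analogue of "UPDATE ... SET prop2 = 42": no cell name can be built.
    write_test2(row, prop2=42)
    rejected = False
except ValueError:
    rejected = True
```

Under that assumption, setting prop2 alone is rejected because there is neither 
a first name component nor a value to store.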

You also have the similar thing for gets: you can do
{noformat}
SELECT prop2 FROM test1 WHERE key = 'someKey';
{noformat}
but this makes no sense with test2 (or rather there is no way we can do this 
efficiently, i.e., without reading the row fully).
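
The read side can be sketched the same way. This is a toy Python model, not 
Cassandra code, and it assumes that a test1 row stores one named column per 
property, while a test2 row stores cells named by the composite (prop1, prop2) 
with prop3 as the value. Both select helpers are hypothetical.

```python
# Toy model of the two storage layouts, not Cassandra code.
# Assumption: test1 keeps one named column per property; test2 names each
# cell by the composite (prop1, prop2) and stores prop3 as the value.

test1_row = {"prop1": 1, "prop2": 42, "prop3": 7}
test2_row = {(1, 42): 7, (2, 43): 9, (3, 44): 11}

def select_prop2_test1(row):
    """prop2 is a column name in test1, so this is a single lookup."""
    return row["prop2"]

def select_prop2_test2(row):
    """prop2 is the second name component in test2: every cell must be read."""
    cells_read = 0
    values = []
    for name in sorted(row):  # cells are ordered by composite name
        cells_read += 1
        if len(name) > 1:
            values.append(name[1])
    return values, cells_read

single = select_prop2_test1(test1_row)
values, cells_read = select_prop2_test2(test2_row)
```

In this model the test2 read touches every cell of the row, whatever the 
number of prop2 values actually wanted.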

That being said, I'll reiterate that I'm reasonably convinced by this 
transposition notion, even though I'd probably prefer to write it as
{noformat}
CREATE TRANSPOSED TABLE test2 (
    key int primary key,
    prop1 int,
    prop2 int,
    prop3 int
)
{noformat}
as was suggested in some comments above.

On the SPARSE thing, I am much less convinced that this is the right solution. 
I think that having at the same 'level' variables that are just names to 
identify values in the resultset (posted_at) and literals (posted_by) is 
confusing (and ugly). (As a side note, I don't "understand" the choice of the 
SPARSE word).

Overall, I'm afraid we'll end up making some bad choices by trying to do too much 
at once. The first problem we have is that CQL, which we'd like to push as the 
de facto way to access Cassandra, doesn't allow access to composite columns at 
all. It seems to me that TRANSPOSED alone fixes that (again, except for the 
dynamic composite type). SPARSE doesn't add any new possibility, it just adds 
a presumably better syntax for a specific case. I would be in favor of moving 
this to a second step (which would be less urgent and would allow refocusing 
the discussion on that very specific optimisation). 

Lastly, and for the record, I would actually be in favor of having the first 
step on this being the addition of a very simple 'raw' notation to access 
composites. Something that could look like the example in the attached file 
'raw_composite.txt' (put separately because this comment is way too long 
already). The advantages being that: it's super simple to do, it'll be natural 
for users coming from thrift and it'll have no specific limitation (in 
particular it'll handle dynamic composites). Then, a second step would be to 
add more limited but more user-friendly notations to deal with specific cases 
(like the transposed and the sparse thing).
                
> CQL support for compound columns
> --------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Pavel Yaskevich
>              Labels: cql
>             Fix For: 1.1
>
>         Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
> 2474-transposed-select.PNG, raw_composite.txt, screenshot-1.jpg, 
> screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.

