[
https://issues.apache.org/jira/browse/CASSANDRA-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258105#comment-13258105
]
Sylvain Lebresne commented on CASSANDRA-4176:
---------------------------------------------
I had meant to open a similar ticket for some time but forgot, so I've now
created it as CASSANDRA-4179. It also basically suggests adding support
for composites in the row key. I decided to open a separate ticket, however, because:
# I didn't mean CASSANDRA-4179 to be specific to sharding; in particular, I
want to discuss the question of composites in column values there as well.
# I think that adding a nice syntax for composites in the row key is indeed nice
for sharding very wide rows, but I'm thinking it could be worth going
even further. What I mean is that sharding a time series is very common, so
we could imagine making that sharding more automatic. For instance (using a
syntax to which I haven't given much thought, but reusing one of my syntax
suggestions from CASSANDRA-4179), we could have:
{noformat}
CREATE TABLE timeline (
    user_id varchar,
    day_of_tweet date AUTO(day(tweet_id)),
    tweet_id uuid,
    author varchar,
    body varchar,
    GROUP (user_id, day_of_tweet) as key,
    PRIMARY KEY (key, tweet_id)
);
{noformat}
for which the semantics would be that day_of_tweet is automatically
calculated from tweet_id.
I'll admit it's a bit specific in a way, and clearly we could say we leave that
to the client, but time series are a very common use case for Cassandra and
sharding rows is very often needed at some granularity, so ...
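For comparison, here is a rough sketch of what "leaving that to the client" might look like in Python. It assumes tweet_id is a time-based (version 1) UUID and that days are cut at UTC midnight; the names day_of_tweet, row_key and UUID_EPOCH_OFFSET are hypothetical and only there for illustration, not part of any proposed syntax or driver API.

```python
import uuid
from datetime import date, datetime, timezone

# Number of 100 ns intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def day_of_tweet(tweet_id: uuid.UUID) -> date:
    """Return the UTC day encoded in a version 1 UUID (assumed shard granularity)."""
    if tweet_id.version != 1:
        raise ValueError("expected a time-based (version 1) UUID")
    # UUID.time is a 60-bit count of 100 ns intervals since the UUID epoch.
    unix_seconds = (tweet_id.time - UUID_EPOCH_OFFSET) // 10_000_000
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).date()


def row_key(user_id: str, tweet_id: uuid.UUID) -> tuple:
    """Build the (user_id, day_of_tweet) pair a client would use as the row key."""
    return (user_id, day_of_tweet(tweet_id))
```

A client would have to call something like row_key("some_user", tweet_id) on every write and every read, which is exactly the bookkeeping the AUTO(...) idea above would move into the schema.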
Anyway, my suggestion would be to keep the 'composites in row key' discussion
in CASSANDRA-4179 and maybe discuss deeper support for row sharding here.
> Support for sharding wide rows in CQL 3.0
> -----------------------------------------
>
> Key: CASSANDRA-4176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4176
> Project: Cassandra
> Issue Type: Sub-task
> Components: API
> Reporter: Nick Bailey
> Fix For: 1.1.1
>
>
> CQL 3.0 currently has support for defining wide rows by declaring a composite
> primary key. For example:
> {noformat}
> CREATE TABLE timeline (
>     user_id varchar,
>     tweet_id uuid,
>     author varchar,
>     body varchar,
>     PRIMARY KEY (user_id, tweet_id)
> );
> {noformat}
> It would also be useful to manage sharding a wide row through the CQL schema.
> This would require being able to split up the actual row key in the schema
> definition. In the above example you might want to make the row key a
> combination of user_id and day_of_tweet, in order to shard timelines by day.
> This might look something like:
> {noformat}
> CREATE TABLE timeline (
>     user_id varchar,
>     day_of_tweet date,
>     tweet_id uuid,
>     author varchar,
>     body varchar,
>     PRIMARY KEY (user_id REQUIRED, day_of_tweet REQUIRED, tweet_id)
> );
> {noformat}
> That's probably a terrible attempt at how to structure that in CQL, but I
> think I've gotten the point across. I tagged this for CQL 3.0, but I'm
> honestly not sure how much work it might be. As far as I know, built-in
> support for composite keys is limited.