[jira] [Reopened] (CASSANDRA-7850) Composite Aware Partitioner

Drew Kutcharian (JIRA) Fri, 29 Aug 2014 17:42:07 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Drew Kutcharian reopened CASSANDRA-7850:
----------------------------------------


Hi [~jbellis], I think you misunderstood this JIRA or, more likely, I didn't 
explain it properly.

In the link that you provided:

bq. Generally, Cassandra will store columns having the same block_id but a 
different breed on different nodes, and columns having the same block_id and 
breed on the same node.

The point of this JIRA is to be able to store columns having the _same_ 
block_id but different breeds on the same node, while block_id and breed 
together still form the partition key. (Think wide-row sharding.)
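
To make the intent concrete, here is a minimal Python sketch, not Cassandra code. It assumes an MD5-based token (in the spirit of RandomPartitioner) and a hypothetical four-node ring; the names token, owner and ring are made up for illustration. Hashing only the first component of the key puts both rows on the same node, even though (block_id, breed) remain distinct partition keys.

```python
import hashlib
from bisect import bisect_left


def token(components, hashed_prefix_len):
    """Hash only the first `hashed_prefix_len` components of a composite key."""
    h = hashlib.md5()
    for part in components[:hashed_prefix_len]:
        raw = part.encode("utf-8")
        # Length-prefix each component so ("ab", "c") and ("a", "bc") differ.
        h.update(len(raw).to_bytes(2, "big"))
        h.update(raw)
    return int.from_bytes(h.digest(), "big")


def owner(ring, tok):
    """Return the node owning the first ring token >= tok, wrapping around."""
    tokens = [t for t, _ in ring]
    return ring[bisect_left(tokens, tok) % len(ring)][1]


# Hypothetical ring: four nodes with evenly spaced tokens over the 128-bit range.
ring = [((i + 1) * (2**128 // 4) - 1, f"node{i}") for i in range(4)]

key_a = ("block-1", "breed-x")
key_b = ("block-1", "breed-y")

# Prefix-only hashing: same token, therefore same node.
same_node = owner(ring, token(key_a, 1)) == owner(ring, token(key_b, 1))

# Hashing the full key (current behaviour) gives unrelated tokens.
full_tokens_differ = token(key_a, 2) != token(key_b, 2)
```

With full-key hashing the two rows may or may not share a node; with prefix-only hashing they always do.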

> Composite Aware Partitioner
> ---------------------------
>
>                 Key: CASSANDRA-7850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Drew Kutcharian
>
> Since C* supports composites for partition keys, I think it'd be useful to 
> be able to use only the first (or first few) components of the key to 
> calculate the token hash.
> A naive use case would be multi-tenancy:
> Say we have accounts and accounts have users. So we would have the following 
> tables:
> {code}
> CREATE TABLE account (
>   id                     timeuuid PRIMARY KEY,
>   company         text
> );
> {code}
> {code}
> CREATE TABLE user (
>   id              timeuuid PRIMARY KEY, 
>   accountId timeuuid,
>   email        text,
>   password text
> );
> {code}
> {code}
> // Get users by account
> CREATE TABLE user_account_index (
>   accountId  timeuuid,
>   userId        timeuuid,
>   PRIMARY KEY(accountId, userId)
> );
> {code}
> Say we want to get all the users that belong to an account. We would first 
> have to get the results from user_account_index and then use a multi-get 
> (WHERE IN) to get the records from the user table. This multi-get could 
> potentially query many different nodes in the cluster. It'd be great if 
> there were a way to limit storage of an account's users to a single node, so 
> that the multi-get would only need to query a single node.
> With this improvement we would be able to define the user table like so:
> {code}
> CREATE TABLE user (
>   id              timeuuid, 
>   accountId timeuuid,
>   email        text,
>   password text,
>   PRIMARY KEY(((accountId),id))  //extra parentheses
> );
> {code}
> I'm not too sure about the notation; it could be something like PRIMARY 
> KEY(((accountId),id)), where the "(accountId)" means use this part to 
> calculate the hash and ((accountId),id) is the actual partition key.
> The main complication I see with this is that we would have to use the table 
> definition when calculating hashes so we know what components of the 
> partition keys need to be used for hash calculation.
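
The complication described in the last paragraph of the description can be sketched in Python as follows. This is an illustration under stated assumptions, not how Cassandra resolves tokens: TABLE_DEFS, hashed_components and routing_token are hypothetical names, and MD5 stands in for whatever hash the partitioner uses. The point is only that the hash function has to consult per-table metadata before it knows which key components to hash.

```python
import hashlib

# Hypothetical schema metadata: for each table, the partition key columns and
# how many leading components take part in the token hash.
TABLE_DEFS = {
    "account": {"partition_key": ("id",), "hashed_components": 1},
    "user": {"partition_key": ("accountId", "id"), "hashed_components": 1},
}


def routing_token(table, row):
    """Compute a token from the hashed prefix of the table's partition key."""
    definition = TABLE_DEFS[table]  # the table definition is required here
    columns = definition["partition_key"][: definition["hashed_components"]]
    h = hashlib.md5()
    for column in columns:
        raw = str(row[column]).encode("utf-8")
        h.update(len(raw).to_bytes(4, "big"))  # length prefix avoids ambiguity
        h.update(raw)
    return int.from_bytes(h.digest(), "big")


u1 = {"accountId": "acct-1", "id": "user-1"}
u2 = {"accountId": "acct-1", "id": "user-2"}
u3 = {"accountId": "acct-2", "id": "user-3"}

# Users of the same account share a token; a different account does not.
same_account = routing_token("user", u1) == routing_token("user", u2)
other_account = routing_token("user", u1) != routing_token("user", u3)
```

Without the lookup in TABLE_DEFS, the function could not tell whether ("acct-1", "user-1") should be hashed in full or by its first component only.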



--
This message was sent by Atlassian JIRA
(v6.2#6252)
