[ 
https://issues.apache.org/jira/browse/CASSANDRA-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285383#comment-14285383
 ] 

Dobrin commented on CASSANDRA-8649:
-----------------------------------

Guys, please share some thoughts? Why don't you like it?

\\
I saw with cassandra-cli that a single user CQL row from the above example

{noformat}
 country | city | area | id  | json | version
---------+------+------+-----+------+---------
       x |    y |    z | 123 |   {} |      17
{noformat}

translates into three columns/cells underneath:

{noformat}
RowKey: x
=> (name=y:z:123:, value=, timestamp=1421827135986000)
=> (name=y:z:123:json, value=7b7d, timestamp=1421827135986000)
=> (name=y:z:123:version, value=0000000000000011, timestamp=1421827135986000)
{noformat}

So is the objection that a CQL row does not map naturally/directly onto the 
underlying C* data structures?

\\
I can have two billion users in a partition, but accessing them via CAS 
statements means they all share one consistency boundary. I need each user 
to be a consistency boundary of its own, because my users share nothing.

\\
I can think of at least two ways to work around this with the current CAS 
support. 
In both cases I first need a partition per user, i.e. 
PRIMARY KEY ((country, city, area, id)), so that each user gets its own 
consistency boundary for concurrent access. Then:

1) Maintain an index CF manually:
CREATE COLUMNFAMILY user_index (
        country text,
        city text,
        area text,
        id text,
        PRIMARY KEY ((country), city, area, id)
);

After each successful CAS insert into the user CF I can do a non-CAS insert 
into the user_index CF (QUORUM consistency level).
After (or before) each successful CAS delete from the user CF I can do a 
non-CAS delete from the user_index CF (QUORUM consistency level).
CAS updates do not touch the index.

\\
This works for me but is not as good as CAS per (partition key + clustering 
key): I have to maintain an additional CF and must not forget to update the 
index whenever I insert into or delete from the user CF.
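
A minimal sketch of what option 1 could look like in CQL, reusing the user 
columns and example values from the issue description and the user_index CF 
defined above. The statement order and the reliance on the [applied] column 
of the CAS result are my assumptions about how a client would drive this, not 
a tested recipe; the QUORUM level is set by the client (for example with 
CONSISTENCY QUORUM in cqlsh), not inside the statement.

```sql
-- One partition per user: the whole primary key is the partition key,
-- so CAS statements for different users do not share Paxos state.
CREATE COLUMNFAMILY user (
    country text,
    city text,
    area text,
    id text,
    json text,
    version bigint,
    PRIMARY KEY ((country, city, area, id))
);

-- Insert: CAS on the user CF first ...
INSERT INTO user (country, city, area, id, json, version)
VALUES ('x', 'y', 'z', '123', '{}', 17) IF NOT EXISTS;

-- ... then, only if the result row has [applied] = true, a plain (non-CAS)
-- insert into the index, issued by the client at QUORUM.
INSERT INTO user_index (country, city, area, id)
VALUES ('x', 'y', 'z', '123');

-- Update: CAS on the user CF only; the index holds no mutable state.
UPDATE user SET json = '{...}', version = 18
WHERE country = 'x' AND city = 'y' AND area = 'z' AND id = '123'
IF version = 17;

-- Delete: CAS delete from the user CF, plain delete from the index.
DELETE FROM user
WHERE country = 'x' AND city = 'y' AND area = 'z' AND id = '123'
IF version = 18;
DELETE FROM user_index
WHERE country = 'x' AND city = 'y' AND area = 'z' AND id = '123';

-- Listing the users of a country, ordered by (city, area, id), then goes
-- through the index instead of the user CF.
SELECT city, area, id FROM user_index WHERE country = 'x';
```

The two writes are not atomic, so a client that fails between them leaves the 
index and the user CF out of sync until something repairs it.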

\\
OR

\\
2) Use OrderPreservingPartitioner. The key distribution among the nodes will 
not be balanced. 
Maybe I could fix that by inserting hash(country) instead of the country, to 
make the country distribution even, or something like this. 
I have never tried OrderPreservingPartitioner. But as far as I can see the 
partitioner is set per cluster and I need it per CF, so this does not work 
for me.

\\
I would appreciate any thoughts or further ideas.

> CAS per (partition key + clustering key) and not only per (partition key)
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8649
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Dobrin
>             Fix For: 3.0
>
>
> Reading the description at 
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0 :
> ...
> * The columns updated do NOT have to be the same as the columns in the IF 
> clause.
> * Lightweight transactions are restricted to a single partition; this is the 
> granularity at which we keep the internal Paxos state. As a corollary, 
> transactions in different partitions will never interrupt each other.
> ...
> So my understanding of the above is that if multiple writers perform, for 
> example, CAS inserts (INSERT...IF NOT EXISTS) using the same partition key 
> and different clustering keys, they will interrupt/interfere with each other.
> Is this understanding correct? (My tests seem to confirm it.)
> For example if I want to model users from different country/city/area and I 
> want to be able to list all the users from a given country ordered by 
> (city,area) and also I know that a single cassandra node will be able to 
> store all the users from a given country but I need to partition users from 
> different countries because a single cassandra node will not be enough:
> CREATE COLUMNFAMILY user (
>       country text,
>       city text,
>       area text,
>       id text,
>       json text,
>       version bigint,
>       PRIMARY KEY ((country), city, area, id)
> );
> Where id is the user id and json is a JSON serialized user object (an 
> aggregate) containing more information about the user. 
> I want to be able to CAS insert many users into the same country concurrently 
> using
>       INSERT INTO user(country, city, area, id, json, version) VALUES 
> ('x',...) IF NOT EXISTS;
>       
> and be able to CAS update users from the same country concurrently:
>       UPDATE user SET json='{...}',version=18 WHERE country='x' AND city='y' 
> AND area='z' AND id='123' IF version=17;
>       
> As I understand this will not be efficient because all the above concurrent 
> statements will have to be "ordered" by the same paxos instance/state per 
> country 'x'? (and trying it results in a lot of WriteTimeoutException-s)
> If yes, can we make Paxos support IF statements per column/cell?
> By cell/column I mean all the underlying persistent state that is behind the 
> compound primary key (partition key + clustering key) - in the above example
>       the state is json and version
>       the partition key is the country
>       and the clustering key is (city, area, id)
> (     
> I'm stating it explicitly as I'm not completely sure whether this is a single 
> cell or double cells underneath at the storage engine, references used:
> http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
> http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
> )
> In other words is it possible to make CAS per (partition key + clustering 
> key) and not only per (partition key)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
