[jira] [Commented] (CASSANDRA-8649) CAS per (partition key + clustering key) and not only per (partition key)

Dobrin (JIRA) Sun, 25 Jan 2015 07:10:20 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291128#comment-14291128
 ]


Dobrin commented on CASSANDRA-8649:
-----------------------------------

Hi Jonathan,

Thanks for your reply and for closing as "Later".

I partly agree with you. I understand why we do not want LWTs to span multiple 
units of replication (it leads to coordination among different replica sets). 
In other words, LWTs cannot be more coarse-grained than the unit of replication.
...
But I'm wondering why LWTs have to be as coarse-grained as the unit of 
replication.
I think that LWTs can be finer-grained, and that this is closer to real-life 
use cases (a unit of replication can hold up to 2 billion rows).

Also note that currently (as illustrated in the example above) we have to 
choose between ordering via the clustering key and concurrent LWTs - we cannot 
have both (not counting the workaround).
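
To make the granularity point concrete, here is a toy Python sketch (not 
Cassandra code). It assumes that concurrent CAS operations contend only when 
they map to the same Paxos key; the names paxos_key and contended_writes are 
made up for illustration, and the rows reuse the user table from the 
description below.

```python
from collections import Counter

# Toy model only: hypothetical helpers, not Cassandra's implementation.
def paxos_key(row, per_row):
    """Return the key under which a CAS on `row` would be serialized."""
    country, city, area, user_id = row
    # Per-row: partition key + clustering key. Per-partition: partition key only.
    return (country, city, area, user_id) if per_row else (country,)

def contended_writes(rows, per_row):
    """Count CAS writes that share their Paxos key with at least one other write."""
    counts = Counter(paxos_key(r, per_row) for r in rows)
    return sum(n for n in counts.values() if n > 1)

# Four concurrent CAS writes: (country, city, area, id)
rows = [
    ("x", "y", "z", "123"),
    ("x", "y", "z", "124"),
    ("x", "q", "r", "125"),
    ("w", "y", "z", "126"),
]

per_partition = contended_writes(rows, per_row=False)  # Paxos state per country
per_row = contended_writes(rows, per_row=True)         # Paxos state per user row
print(per_partition, per_row)
```

With these four sample writes, three share the partition key 'x', so 
per-partition granularity pushes those three through one Paxos state, while 
per-(partition key + clustering key) granularity would leave none of them 
contending.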

Thanks,
Dobrin


> CAS per (partition key + clustering key) and not only per (partition key)
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8649
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Dobrin
>             Fix For: 3.0
>
>
> Reading the description at 
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0 :
> ...
> * The columns updated do NOT have to be the same as the columns in the IF 
> clause.
> * Lightweight transactions are restricted to a single partition; this is the 
> granularity at which we keep the internal Paxos state. As a corollary, 
> transactions in different partitions will never interrupt each other.
> ...
> So my understanding of the above is that if multiple writers perform, for 
> example, CAS inserts (INSERT...IF NOT EXISTS) using the same partition key 
> and different clustering keys, they will interrupt/interfere with each other?
> Is this understanding correct? (my tests seem to confirm it)
> For example, suppose I want to model users from different country/city/area 
> and I want to be able to list all the users from a given country ordered by 
> (city,area). I know that a single cassandra node will be able to store all 
> the users from a given country, but I need to partition users from different 
> countries because a single cassandra node will not be enough:
> CREATE COLUMNFAMILY user (
>       country text,
>       city text,
>       area text,
>       id text,
>       json text,
>       version bigint,
>       PRIMARY KEY ((country), city, area, id)
> );
> Where id is the user id and json is a JSON serialized user object (an 
> aggregate) containing more information about the user. 
> I want to be able to CAS insert many users into the same country concurrently 
> using
>       INSERT INTO user(country, city, area, id, json, version) VALUES 
> ('x',...) IF NOT EXISTS;
>       
> and be able to CAS update users from the same country concurrently:
>       UPDATE user SET json='{...}',version=18 WHERE country='x' AND city='y' 
> AND area='z' AND id='123' IF version=17;
>       
> As I understand this will not be efficient because all the above concurrent 
> statements will have to be "ordered" by the same paxos instance/state per 
> country 'x'? (and trying it results in a lot of WriteTimeoutException-s)
> If yes - can we make paxos support IF statements per column/cell?
> By cell/column I mean all the underlying persistent state that is behind the 
> compound primary key (partition key + clustering key) - in the above example
>       the state is json and version
>       the partition key is the country
>       and the clustering key is (city, area, id)
> (
> I'm stating it explicitly as I'm not completely sure whether this is a single 
> cell or two cells underneath at the storage engine; references used:
> http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
> http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
> )
> In other words is it possible to make CAS per (partition key + clustering 
> key) and not only per (partition key)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
