Re: Batch : Isolation and Atomicity for same partition on multiple table

Jeff Jirsa Thu, 14 Dec 2017 22:13:14 -0800

Again, a lot of potential problems can be solved with data modeling - in 
particular consider things like conditional batches where the condition is on a 
static cell/column and writes go to different CQL rows.
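
As a rough sketch of that idea, assuming the two tables can be merged into one: the table name user_data and the columns source and version below are made up for illustration and are not part of the schema in this thread. A conditional batch has to stay within one table and one partition, so the rows from both original tables are placed in a single partition and a static column carries the condition.

```cql
-- Hypothetical single-table model: rows from both original tables
-- live in the same partition, keyed by user_id.
CREATE TABLE user_data (
   user_id text,
   source text,          -- assumption: 'A' or 'B', the original table of the row
   clustering1 text,
   clustering2 text,     -- assumption: placeholder value for rows that only need one clustering column
   value text,
   version int static,   -- one cell per partition, used as the batch condition
   PRIMARY KEY (user_id, source, clustering1, clustering2));

-- Seed the static cell once for the partition.
INSERT INTO user_data (user_id, version) VALUES ('1234', 1) IF NOT EXISTS;

-- All statements are applied only if version is still 1.
BEGIN BATCH
UPDATE user_data SET version = 2 WHERE user_id = '1234' IF version = 1;
INSERT INTO user_data (user_id, source, clustering1, clustering2, value)
   VALUES ('1234', 'A', 'c1', '-', 'val1');
INSERT INTO user_data (user_id, source, clustering1, clustering2, value)
   VALUES ('1234', 'B', 'cl1', 'cl2', 'avalue');
APPLY BATCH;
```

The condition makes the batch a lightweight transaction, so it costs more latency than a plain batch; whether that trade-off is acceptable depends on the use case.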


-- 
Jeff Jirsa


> On Dec 14, 2017, at 9:57 PM, Mickael Delanoë <[email protected]> wrote:
> 
> Thanks Jeff, 
> I am a little disappointed to hear that the guarantees are even weaker, but I 
> will take a look at this and try to understand what is really done.
> 
> 
> 
> On 13 Dec 2017 at 18:18, "Jeff Jirsa" <[email protected]> wrote:
> The entry point is here:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java#L346
> which calls through to
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L938-L953
> 
> I believe the guarantees are weaker than the blog suggests, but it's nuanced, 
> and a lot of these types of questions come down to data model (you can model 
> it in a way that you can avoid problems with weaknesses in isolation, but 
> that requires a detailed explanation of your use case, etc).
> 
> 
> 
> 
>> On Wed, Dec 13, 2017 at 8:56 AM, Mickael Delanoë <[email protected]> 
>> wrote:
>> Hi Nicolas, 
>> Thanks for your answer.
>> Are you 100% sure of that assumption?
>> The few tests I did, using nodetool getendpoints, showed that the data for 
>> the two tables went to the same nodes when I used the same partition key. 
>> So I would have expected Cassandra to be smart enough to apply them to the 
>> memtable in a single operation and achieve isolation, since the whole batch 
>> is executed on a single node.
>> Does anybody know where batch operations are handled in the Cassandra 
>> source code, so I can check how all this is processed?
>> 
>> Regards,
>> Mickaël
>> 
>> 
>> 
>> 2017-12-13 11:18 GMT+01:00 Nicolas Guyomar <[email protected]>:
>>> Hi Mickael,
>>> 
>>> Partitions are tied to the table they exist in, so in your case you are 
>>> targeting 2 partitions in 2 different tables.
>>> Therefore, IMHO, you will only get atomicity using your batch statement.
>>> 
>>>> On 11 December 2017 at 15:59, Mickael Delanoë <[email protected]> wrote:
>>>> Hello,
>>>> 
>>>> I have a question regarding batch isolation and atomicity for queries 
>>>> using the same partition key.
>>>> 
>>>> The DataStax documentation says this about batches:
>>>> "Combines multiple DML statements to achieve atomicity and isolation when 
>>>> targeting a single partition or only atomicity when targeting multiple 
>>>> partitions. A batch applies all DMLs within a single partition before the 
>>>> data is available, ensuring atomicity and isolation."
>>>> 
>>>> But I am trying to find out exactly what counts as a "single partition", 
>>>> and I have not found a clear answer yet. The examples and explanations 
>>>> always talk about a partition with only one table used inside the batch. 
>>>> My concern is what happens when a batch uses different tables, so I 
>>>> would like some clarification.
>>>> 
>>>> Here is my use case. I have 2 tables with the same partition key, 
>>>> "user_id":
>>>> 
>>>> CREATE TABLE tableA (
>>>>    user_id text, 
>>>>    clustering text, 
>>>>    value text, 
>>>>    PRIMARY KEY (user_id, clustering));
>>>> 
>>>> CREATE TABLE tableB (
>>>>    user_id text, 
>>>>    clustering1 text, 
>>>>    clustering2 text, 
>>>>    value text, 
>>>>    PRIMARY KEY (user_id, clustering1, clustering2));
>>>> 
>>>> If I run a batch like this:
>>>> 
>>>> BEGIN BATCH 
>>>> INSERT INTO tableA (user_id, clustering, value)
>>>>    VALUES ('1234', 'c1', 'val1');
>>>> INSERT INTO tableB (user_id, clustering1, clustering2, value)
>>>>    VALUES ('1234', 'cl1', 'cl2', 'avalue');
>>>> APPLY BATCH;
>>>> 
>>>> the DML statements use the same partition key. Can we say they target 
>>>> the same partition, or, since the partition keys belong to different 
>>>> tables, should we consider them different partitions? In other words, 
>>>> does this batch ensure atomicity and isolation (in the sense described in 
>>>> the DataStax doc), or only atomicity?
>>>> 
>>>> Thanks for your help,
>>>> Mickaël Delanoë
>>> 
>> 
>> 
>> 
>> -- 
>> Mickaël Delanoë
> 
>
