[jira] [Updated] (CASSANDRA-10085) Allow BATCH with conditions to span multiple tables with same partition key

C. Scott Andreas (JIRA) Sun, 18 Nov 2018 21:42:30 -0800


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


C. Scott Andreas updated CASSANDRA-10085:
-----------------------------------------
    Component/s: CQL

> Allow BATCH with conditions to span multiple tables with same partition key
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10085
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10085
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Dennis Roppelt
>            Priority: Major
>              Labels: batch, conditional-statement, partitioning
>
> h4. Use case:
> I want to use batch to keep data between tables synchronized with help of 
> conditionals. In particular, I want to have a ONE-to-ONE relationship in my 
> data model. I dont want to have an unintended upsert while trying to maintain 
> a consistent one-to-one relationship (write requests from other clients that 
> try to create a different relationship with one of the keys I inteded to 
> use), which is why I would like the IF NOT EXISTS when inserting data into 
> multiple tables in a batch.
> But when trying to insert data in a batch-block with an IF NOT EXISTS, I get 
> the following response (reproduceable example further below):
> {code}
> cqlsh:testkeyspace> 
> BEGIN BATCH 
> INSERT INTO key1_to_key2 (partKey, key1, key2) values (1,2,3) IF NOT EXISTS; 
> INSERT INTO key2_to_key1 (partKey, key1, key2) values (1,2,3) IF NOT EXISTS; 
> APPLY BATCH;
> Bad Request: Batch with conditions cannot span multiple tables{code}
> The tables have a primary key on (partKey, key1) and (partKey, key2). Which 
> means that partKey in both cases is their partition key. Which in the example 
> also is the same value.
> h4. Why I want to use a BATCH-statement that way:
> In traditional databases, to design a one-to-one relationship, I would create 
> one table with two keys, both with a unique constraint:
> {code:sql}
> CREATE TABLE  `myOneToOneTable` (
>   `ID` int NOT NULL,
>   `KEY_ONE` int NOT NULL,
>   `KEY_TWO` int NOT NULL,
> // some other fields
>   PRIMARY KEY (`ID`),
>   UNIQUE KEY `KEY_ONE`,
>   UNIQUE KEY `KEY_TWO`
> );
> {code}
> (simplified example taken from an [one-to-one example for 
> mysql|http://www.mkyong.com/mysql/how-to-define-one-to-one-relationship-in-mysql/])
> As far as I know, unique constraints are not supported in Cassandra, so 
> another way around it would be to maintain the relationshipt between two 
> tables, each sharing the same partition key. For which I would like Cassandra 
> to make sure that when I insert data, either both are inserted at the same 
> time or none (to prevent other writing client to insert a different 
> one-to-one relationship than me). 
> [According to the documentation, this is what BATCH-Statements are used 
> for|http://docs.datastax.com/en/cql/3.3/cql/cql_using/useBatch.html]
> {quote}
> Instead, using batches to synchronize data to tables is a legitimate 
> operation. 
> {quote}
> I expected it to be able to do it, but as seen above, I cannot enforce INSERT 
> IF NOT EXISTS over multiple tables.
> I do understand the background as to *why* there is a limitation coded that 
> way, so that not too many nodes have to take part in processing of the batch. 
> However, if the operations use
> # primary keys in their statements for each table
> # all primary keys contain the same value, thus resulting in the same hash 
> for partioning
> # tables belong to the same keyspace
> the batch statement should be able to be limited to the same amount of nodes 
> as if the batch contained INSERT IF NOT EXISTs for only one table. 
> h4. Steps to reproduce desired behaviour:
> I used the [cassandra ova on virtual 
> box|http://www.planetcassandra.org/install-cassandra-ova-on-virtualbox/], but 
> also a Windows Setup with 2.2.0
> {code}
> Connected to Cluster on a Stick at localhost:9160.                            
>   
> [cqlsh 4.1.1 | Cassandra 2.0.7 | CQL spec 3.1.1 | Thrift protocol 19.39.0]    
>   
> Use HELP for help.                                                            
>   
> cqlsh> CREATE KEYSPACE conditionalBatch                                       
>   
> WITH REPLICATION = {'class':'SimpleStrategy','replication_factor':1};    
> cqlsh> use conditionalbatch ;                                                 
>   
> cqlsh:conditionalbatch> CREATE TABLE key1_to_key2 (                           
>   
> partKey int,                                            
> key1 int,                                               
> key2 int,                                               
> PRIMARY KEY (partKey, key1)                             
> );                                                      
> cqlsh:conditionalbatch> CREATE TABLE key2_to_key1 ( partKey int, key2 int, 
> key1 
> int, PRIMARY KEY (partKey, key2) 
> );                                             
> cqlsh:conditionalbatch> BEGIN BATCH                                           
>   
> INSERT INTO key1_to_key2 (partKey, key1, key2) VALUES (1,2,3) IF NOT EXISTS;  
>                                                           
> INSERT INTO key2_to_key1 (partKey, Key1, key2) VALUES (1,2,3) IF NOT EXISTS;  
>                                                           
> APPLY BATCH;                                            
> {code}
> h4. What should happen:
> Either succeding, or:
> {code}
>  applied | partkey | key1 | key2|                                             
>  
> -----------+---------+------+------                                           
>   
>      false |       1 |    2 |    3                                            
>   
>      false |       1 |    2 |    3   
> {code}
> h4. What happens instead:
> {code}
> Bad Request: Batch with conditions cannot span multiple tables 
> {code}
> In my opinion, this is a very crucial feature wich enables the users to 
> maintain unique constraints with a relatively small amount of work. There 
> certainly are other ways to maintain unique constrains, but I have not found 
> a trivial way within cassandra that lets me that easily. But maybe I have 
> overlooked something though.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-10085) Allow BATCH with conditions to span multiple tables with same partition key

Reply via email to