[jira] [Updated] (CASSANDRA-14323) Same timestamp insert conflict resolution breaks row-level data consistency

2018-03-19 Thread Rishi Kathera (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rishi Kathera updated CASSANDRA-14323:
--
Description: 
When inserting multiple rows with the same primary key and timestamp, memtable 
update logic does not maintain row-level consistency for the key inserted. For 
example,
{code:sql}
-- Assumes the keyspace "test" already exists.
CREATE TABLE test.consistency (pk int PRIMARY KEY, nk1 text, nk2 text);

-- Both inserts target the same primary key and share the batch timestamp.
BEGIN UNLOGGED BATCH USING TIMESTAMP 1521080773000
INSERT INTO test.consistency (pk, nk1, nk2) VALUES (2, 'nk1', 'nk2');
INSERT INTO test.consistency (pk, nk1, nk2) VALUES (2, 'nk2', 'nk1');
APPLY BATCH;

SELECT * FROM test.consistency;
{code}
In this case, I would expect one row to overwrite the other, so that the read 
returns either
{code:java}
2, nk1, nk2{code}
or
{code:java}
2, nk2, nk1{code}
but the row retrieved is
{code:java}
2, nk2, nk2{code}
which breaks row-level consistency of the writes. This behavior comes from the 
following logic:

[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Conflicts.java#L45]

where it appears that the cell value itself is used to resolve the overwrite 
conflict, which I don't think is the correct way to handle the situation. 
Shouldn't the outcome be either overwrite or not overwrite in all cases?
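
To illustrate how a per-cell tie-break can produce the mixed row above, here is a minimal Java sketch. It is not the Cassandra source: the names TimestampTieSketch, Cell, reconcile and mergeRows are hypothetical, and it assumes that on equal timestamps the cell with the greater byte value wins, which is what the observed result suggests.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; NOT the actual Cassandra implementation.
final class TimestampTieSketch {

    // Hypothetical stand-in for a live, non-expiring cell.
    static final class Cell {
        final long timestamp;
        final byte[] value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }

        String text() {
            return new String(value, StandardCharsets.UTF_8);
        }
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }

    // Assumed rule: higher timestamp wins; on a tie, the greater value wins.
    static Cell reconcile(Cell left, Cell right) {
        if (left.timestamp != right.timestamp)
            return left.timestamp > right.timestamp ? left : right;
        return compareBytes(left.value, right.value) >= 0 ? left : right;
    }

    // Merges two writes of the same row column by column, as a per-cell
    // resolution would, with both writes carrying the same timestamp.
    static String[] mergeRows(long ts, String[] first, String[] second) {
        String[] merged = new String[first.length];
        for (int i = 0; i < first.length; i++)
            merged[i] = reconcile(new Cell(ts, first[i]), new Cell(ts, second[i])).text();
        return merged;
    }

    public static void main(String[] args) {
        String[] merged = mergeRows(1521080773000L,
                                    new String[] {"nk1", "nk2"},
                                    new String[] {"nk2", "nk1"});
        System.out.println("2, " + String.join(", ", merged)); // prints "2, nk2, nk2"
    }
}
```

Because each column is resolved independently, nk1 takes its value from the second insert and nk2 from the first, so the merged row matches neither write.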


> Same timestamp insert conflict resolution breaks row-level data consistency
> ---
>
> Key: CASSANDRA-14323
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14323
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>   Reporter: Rishi Kathera
>   Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14323) Same timestamp insert conflict resolution breaks row-level data consistency

2018-03-19 Thread Rishi Kathera (JIRA)
Rishi Kathera created CASSANDRA-14323:
-

 Summary: Same timestamp insert conflict resolution breaks row-level data consistency
 Key: CASSANDRA-14323
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14323
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Rishi Kathera


When inserting multiple rows with the same primary key and timestamp, memtable 
update logic does not maintain row-level consistency for the key inserted. For 
example,
{code:java}
create table test.consistency(pk int PRIMARY KEY , nk1 text, nk2 text);
BEGIN UNLOGGED BATCH USING TIMESTAMP 1521080773000 
insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk1','nk2'); 
insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk2','nk1'); 
APPLY BATCH; 
select * from test.consistency;
{code}
In this case, I would expect one row to overwrite the other, so that the read 
returns either
{code:java}
2, nk1, nk2{code}
or
{code:java}
2, nk2, nk1{code}
but the row retrieved is
{code:java}
2, nk2, nk2{code}
 which breaks consistency of the writes. This behavior comes from this logic, 

[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Conflicts.java#L45]

where it appears that the cell value itself is used to resolve the overwrite 
conflict, which I don't think is the correct way to handle the situation. 
Shouldn't the outcome be either overwrite or not overwrite in all cases?


