[ https://issues.apache.org/jira/browse/CASSANDRA-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp resolved CASSANDRA-11000.
--------------------------------------
       Resolution: Won't Fix
    Reproduced In: 3.0.1, 2.2.4  (was: 2.2.4, 3.0.1)

I completely agree with [~spo...@gmail.com].

Writing data with a timestamp in the future is a use case that requires 
knowledge of how C* is designed and works (however, it can also happen when 
the system wall clocks are not in sync, which is why we recommend keeping 
them in sync). And what should C* do if it detects such a timestamp in the 
future? Shall it reject the operation? But what if the system wall clock was 
out of sync and has since been adjusted? Is the operation still valid or not? 
I assume there is no single "golden" answer that is viable for everybody. I 
think adding such a check to LWT operations would make things even worse.

Also, mixing LWT and non-LWT statements is a valid use case, but mixing them 
on the same columns can cause trouble.

In a perfect world this might be easy to solve, but the world is not perfect: 
non-LWT updates can, for example, be delayed by LAN or WAN failures or node 
outages (hardware failures, regular maintenance operations, etc.). There is 
just too much that needs to be considered.

TL;DR: that's why I resolved this as Won't Fix.

> Mixing LWT and non-LWT operations can result in an LWT operation being 
> acknowledged but not applied
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11000
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11000
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>         Environment: Cassandra 2.1, 2.2, and 3.0 on Linux and OS X.
>            Reporter: Sebastian Marsching
>
> When mixing light-weight transaction (LWT, a.k.a. compare-and-set, 
> conditional update) operations with regular operations, it can happen that an 
> LWT operation is acknowledged (applied = True), even though the update has 
> not been applied and a SELECT operation still returns the old data.
> For example, consider the following table:
> {code}
> CREATE TABLE test (
>     pk text,
>     ck text,
>     v text,
>     PRIMARY KEY (pk, ck)
> );
> {code}
> We start with an empty table and insert data using a regular (non-LWT) 
> operation:
> {code}
> INSERT INTO test (pk, ck, v) VALUES ('foo', 'bar', '123');
> {code}
> A following SELECT statement returns the data as expected. Now we do a 
> conditional update (LWT):
> {code}
> UPDATE test SET v = '456' WHERE pk = 'foo' AND ck = 'bar' IF v = '123';
> {code}
> As expected, the update is applied and a following SELECT statement shows the 
> updated value.
> Now we do the same but use a time stamp that is slightly in the future (e.g. 
> a few seconds) for the INSERT statement (obviously $time$ needs to be 
> replaced by a time stamp that is slightly ahead of the system clock).
> {code}
> INSERT INTO test (pk, ck, v) VALUES ('foo', 'bar', '123') USING TIMESTAMP 
> $time$;
> {code}
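> For illustration, here is one way $time$ could be chosen (a sketch only; the 
> five-second offset is arbitrary, and the value is in microseconds since the 
> Unix epoch, which is the unit Cassandra uses by convention for write time 
> stamps):
> {code}
> // Hypothetical helper for choosing $time$: the current wall-clock time in
> // microseconds plus roughly five seconds.
> public class FutureTimestamp {
>     public static void main(String[] args) {
>         long futureMicros = System.currentTimeMillis() * 1000L + 5_000_000L;
>         System.out.println("USING TIMESTAMP " + futureMicros);
>     }
> }
> {code}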
> Now, running the same UPDATE statement still reports success (applied = True). 
> However, a subsequent SELECT yields the old value ('123') instead of the 
> updated value ('456'). Inspecting the time stamp of the value indicates that 
> it has not been replaced (the value from the original INSERT is still in 
> place).
> This behavior is exhibited in single-node clusters running Cassandra 
> 2.1.11, 2.2.4, and 3.0.1.
> Testing this in a multi-node cluster is a bit trickier, so I only tested 
> it with Cassandra 2.2.4. Here, I made one of the nodes lag behind in time 
> for a few seconds (using libfaketime). I used a replication factor of three 
> for the test keyspace. In this case, the behavior can be demonstrated even 
> without using an explicitly specified time stamp. Running
> {code}
> INSERT INTO test (pk, ck, v) VALUES ('foo', 'bar', '123');
> {code}
> on a node with the regular clock followed by
> {code}
> UPDATE test SET v = '456' WHERE pk = 'foo' AND ck = 'bar' IF v = '123';
> {code}
> on the node lagging behind results in the UPDATE reporting success while 
> the old value remains in place.
> Interestingly, everything works as expected if using LWT operations 
> consistently: When running
> {code}
> UPDATE test SET v = '456' WHERE pk = 'foo' AND ck = 'bar' IF v = '123';
> UPDATE test SET v = '123' WHERE pk = 'foo' AND ck = 'bar' IF v = '456';
> {code}
> in an alternating fashion on two nodes (one with a "normal" clock, one with 
> the clock lagging behind), the updates are applied as expected. When checking 
> the time stamps ("{{SELECT WRITETIME(v) FROM test;}}"), one can see that the 
> time stamp is increased by just a single tick when the statement is executed 
> on the node lagging behind.
> I think that this problem is strongly related to (or maybe even the same as) 
> the one described in CASSANDRA-7801, even though CASSANDRA-7801 was mainly 
> concerned with a single-node cluster. However, the fact that this problem 
> still exists in current versions of Cassandra makes me suspect that either it 
> is a different problem or the original problem was not fixed completely with 
> the patch from CASSANDRA-7801.
> I found CASSANDRA-9655, which suggests removing the changes introduced with 
> CASSANDRA-7801 because they can be problematic under certain circumstances, 
> but I am not sure whether that is the right place to discuss the issue I am 
> experiencing. If you think it is, feel free to close this issue and update 
> the description of CASSANDRA-9655 instead.
> In my opinion, the best way to fix this problem would be to ensure that a 
> write that is part of an LWT always uses a time stamp that is at least one 
> tick greater than the time stamp of the existing data. As the existing data 
> has to be read to check the condition anyway, I do not think this would 
> cause any additional overhead. If this is not possible, I suggest looking 
> into whether we can somehow detect such a situation and at least report 
> failure (applied = False) for the LWT instead of reporting success.
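> A rough sketch of the first suggestion (illustrative only, not actual 
> Cassandra code; it assumes the write time of the existing cell is available 
> from the read that evaluated the condition):
> {code}
> // Hypothetical rule for the timestamp of the write half of an LWT: never use
> // a timestamp less than or equal to that of the data the condition was
> // evaluated against, so the conditional write cannot be shadowed by it.
> public class LwtTimestampRule {
>     static long chooseLwtWriteTimestamp(long existingWriteTimeMicros) {
>         long nowMicros = System.currentTimeMillis() * 1000L;
>         // at least one tick (one microsecond) newer than the existing data
>         return Math.max(nowMicros, existingWriteTimeMicros + 1);
>     }
> }
> {code}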
> The second, detection-based solution would at least fix those cases where 
> code checks the success of an LWT before performing any further actions 
> (e.g. because the LWT is used to take some kind of lock). Currently, the 
> code will assume that the 
> operation was successful (and thus - staying in the example - it owns the 
> lock), while other processes running in parallel will see a different state. 
> It is my understanding that LWTs were designed to avoid exactly this 
> situation, but at the moment the assumptions most users will make about LWTs 
> do not always hold.
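> For illustration, the lock-taking pattern described above typically looks 
> like the following when using the DataStax Java driver (contact point, 
> keyspace name and query are placeholders); such code has no way of noticing 
> that the "applied" answer was wrong:
> {code}
> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.ResultSet;
> import com.datastax.driver.core.Session;
>
> // Sketch of a client that treats a successful LWT as having acquired a lock.
> public class LwtLockExample {
>     public static void main(String[] args) {
>         try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
>              Session session = cluster.connect("ks")) {
>             ResultSet rs = session.execute(
>                 "UPDATE test SET v = '456' WHERE pk = 'foo' AND ck = 'bar' IF v = '123'");
>             if (rs.wasApplied()) {
>                 // proceed as if we exclusively hold the lock
>                 System.out.println("CAS reported applied");
>             } else {
>                 System.out.println("CAS rejected");
>             }
>         }
>     }
> }
> {code}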
> Until this issue is solved, I suggest at least updating the CQL 
> documentation to clearly state that LWTs / conditional updates are not safe 
> if data has previously been INSERTed / UPDATEd / DELETEd using non-LWT 
> operations and there is clock skew, or if time stamps in the future have 
> been supplied explicitly. This should at least save some users from making 
> wrong assumptions about LWTs and only realizing it when their application 
> fails in an unsafe way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
