That really should work, unless I'm missing something. If you retry your
test with either 1.2.13 or 2.0.4 (as opposed to earlier releases of either
branch), triple-check your observations to make sure that your
single-threaded code is doing what you think it is, and still see the
behaviour, I would want to investigate this much more deeply.

-Tupshin


On Fri, Jan 10, 2014 at 6:29 PM, Manoj Khangaonkar <khangaon...@gmail.com> wrote:

> Thanks all for the response. I will change to keeping writes idempotent
> and aggregate at a later stage.
>
> But considering my read, write, read operations are sequential, from the
> same thread, and with consistency ALL,
> the write should not return until all replicas have committed. So I am
> expecting all replicas to have the same value when the next read happens.
> Not true??
>
> regards
>
>
> On Fri, Jan 10, 2014 at 2:51 PM, Tupshin Harper <tups...@tupshin.com> wrote:
>
>> Yes, this is pretty close to the ultimate anti-pattern in Cassandra.
>> Whenever possible, we encourage models where your updates are idempotent
>> and not dependent on a read before write. Manoj is looking for what is
>> essentially strong ordering in a distributed system, which always has
>> inherent trade-offs.
>>
>> CAS (lightweight transactions) in 2.0 might actually be usable for this,
>> but it will badly hurt your performance, and is not recommended.
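The CAS approach amounts to a conditional update (in CQL, something like `UPDATE counts SET val = ? WHERE id = ? IF val = ?`) wrapped in a retry loop. A minimal sketch, using an in-memory dict as a stand-in for the table; the `Table` class and all names here are illustrative, not a driver API:

```python
class Table:
    """In-memory stand-in for a Cassandra table with conditional updates."""
    def __init__(self):
        self.rows = {}

    def cas_update(self, key, expected, new):
        # Mirrors the [applied] result of a CQL `UPDATE ... IF val = ?`:
        # the write lands only if the current value matches `expected`.
        if self.rows.get(key, 0) == expected:
            self.rows[key] = new
            return True
        return False

def add_with_cas(table, key, delta):
    # Retry the read-modify-write until our conditional update wins.
    while True:
        current = table.rows.get(key, 0)
        if table.cas_update(key, current, current + delta):
            return current + delta

t = Table()
add_with_cas(t, "1389366000-H", 252)
add_with_cas(t, "1389366000-H", 252)
print(t.rows["1389366000-H"])  # 504
```

Each failed condition forces a re-read and another attempt, and in real Cassandra every attempt runs a Paxos round, which is where the performance cost comes from.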
>>
>> 2.1 counters (major counter rewrite) are actually very likely to be a
>> great fit for this, but they still won't have TTL. That, however, could
>> easily be worked around, IMO. It would just require a bit of housekeeping
>> to keep track of your counters and lazily delete them.
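The housekeeping workaround could be sketched like this; everything below is hypothetical application-side bookkeeping, not a Cassandra feature:

```python
import time

# Since counters have no TTL, record each counter's creation time in a
# side table and lazily delete counters that have outlived the desired TTL.
counters = {}    # counter name -> count
created_at = {}  # counter name -> creation time (epoch seconds)

def bump(name, delta, now=None):
    now = time.time() if now is None else now
    created_at.setdefault(name, now)
    counters[name] = counters.get(name, 0) + delta

def sweep(ttl_seconds, now=None):
    # The lazy delete: drop any counter older than the TTL.
    now = time.time() if now is None else now
    expired = [n for n, t in created_at.items() if now - t > ttl_seconds]
    for name in expired:
        counters.pop(name, None)
        created_at.pop(name, None)

bump("hour-07", 5, now=0)
bump("hour-08", 3, now=3600)
sweep(ttl_seconds=1800, now=3700)  # hour-07 is expired; hour-08 survives
print(sorted(counters))  # ['hour-08']
```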
>>
>> But yes, I third Robert's suggestion of aggregate on read instead of
>> write.
>>
>> -Tupshin
>>
>>
>> On Fri, Jan 10, 2014 at 5:41 PM, Steven A Robenalt <srobe...@stanford.edu> wrote:
>>
>>> My understanding is that it's generally a Cassandra anti-pattern to do
>>> read-before-write in any case, not just because of this issue. I'd agree
>>> with Robert's suggestion earlier in this thread of writing each update
>>> independently and aggregating on read.
>>>
>>> Steve
>>>
>>>
>>>
>>> On Fri, Jan 10, 2014 at 2:35 PM, Robert Wille <rwi...@fold3.com> wrote:
>>>
>>>> Actually, locking won’t fix the problem. He’s getting the problem on a
>>>> single thread.
>>>>
>>>> I’m pretty sure that if updates can occur within the same millisecond
>>>> (or more, if there is clock skew), there is literally nothing you can do to
>>>> make this pattern work.
>>>>
>>>> Robert
>>>>
>>>> From: Todd Carrico <todd.carr...@match.com>
>>>> Reply-To: <user@cassandra.apache.org>
>>>> Date: Friday, January 10, 2014 at 3:28 PM
>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Subject: RE: Read/Write consistency issue
>>>>
>>>> That, or roll your own locking.  Means multiple updates, but it works
>>>> reliably.
>>>>
>>>>
>>>>
>>>> tc
>>>>
>>>>
>>>>
>>>> *From:* Robert Wille [mailto:rwi...@fold3.com]
>>>> *Sent:* Friday, January 10, 2014 4:25 PM
>>>> *To:* user@cassandra.apache.org
>>>> *Subject:* Re: Read/Write consistency issue
>>>>
>>>>
>>>>
>>>> Cassandra is a last-write wins kind of a deal. The last write is
>>>> determined by the timestamp. There are two problems with this:
>>>>
>>>>    1. If your clocks are not synchronized, you’re totally screwed. Note
>>>>    that the 2nd and 3rd to last operations occurred just 2 milliseconds
>>>>    apart. A clock skew of 2 milliseconds would definitely manifest
>>>>    itself like that.
>>>>    2. Even if your clocks are perfectly synchronized, timestamps only
>>>>    have millisecond granularity. If multiple writes occur within the
>>>>    same millisecond, it’s impossible for Cassandra to determine which
>>>>    one occurred last.
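The tie case in point 2 can be made concrete. As I understand it, when two cells carry the same timestamp, Cassandra breaks the tie by comparing the cell values rather than arrival order; here is a simplified model of that (real reconciliation compares serialized bytes and treats deletions specially):

```python
# Simplified model of Cassandra cell reconciliation: highest timestamp wins;
# on a timestamp tie, the greater value wins, regardless of arrival order.
def reconcile(cell_a, cell_b):
    ts_a, val_a = cell_a
    ts_b, val_b = cell_b
    if ts_a != ts_b:
        return cell_a if ts_a > ts_b else cell_b
    return cell_a if val_a >= val_b else cell_b

# Two sequential single-threaded writes landing in the same millisecond:
first_write  = (1389366120244, 100)  # arrives first
second_write = (1389366120244, 42)   # arrives second, same millisecond
print(reconcile(first_write, second_write))  # (1389366120244, 100)
```

The second write is silently lost even though it happened later in wall-clock order, because nothing in the cell records arrival order.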
>>>>
>>>> Lots of really good information here:
>>>> http://aphyr.com/posts/294-call-me-maybe-cassandra/
>>>>
>>>>
>>>>
>>>> I’d be very interested in hearing what others have to say. In the
>>>> article I just linked to, the author experienced similar problems, even
>>>> with “perfectly synchronized clocks”, whatever that means.
>>>>
>>>>
>>>>
>>>> The conclusion I’ve arrived at after reading and pondering is that if
>>>> you perform multiple updates to a cell, even with synchronous calls from
>>>> a single-threaded app, and those updates occur less than a millisecond
>>>> apart, or within the sum of the clock drift and network latency, you’re
>>>> probably hosed.
>>>>
>>>>
>>>>
>>>> I think a better approach for Cassandra would be to write new values
>>>> each time, and then sum them up on read, or perhaps have a process that
>>>> periodically aggregates them. It’s a tricky business for sure, not one that
>>>> Cassandra is very well equipped to handle.
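The write-new-values-then-sum idea might be sketched as follows; in CQL this would be a wide row with a timeuuid clustering column, but a plain dict stands in here and all names are illustrative:

```python
import uuid

# Each update is an insert under a fresh timeuuid, so writes never overwrite
# one another and no read-before-write is needed; totals are computed on read.
deltas = {}  # (counter_id, timeuuid) -> delta

def record(counter_id, delta):
    deltas[(counter_id, uuid.uuid1())] = delta

def total(counter_id):
    # Aggregate on read.
    return sum(d for (cid, _), d in deltas.items() if cid == counter_id)

record("1389366000-H", 252)
record("1389366000-H", 252)
record("1389366000-H", 255)
print(total("1389366000-H"))  # 759
```

A periodic job could fold old deltas into a checkpoint row so that reads don't have to scan an ever-growing partition.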
>>>>
>>>>
>>>>
>>>> Robert
>>>>
>>>>
>>>>
>>>> *From: *Manoj Khangaonkar <khangaon...@gmail.com>
>>>> *Reply-To: *<user@cassandra.apache.org>
>>>> *Date: *Friday, January 10, 2014 at 2:50 PM
>>>> *To: *<user@cassandra.apache.org>
>>>> *Subject: *Read/Write consistency issue
>>>>
>>>>
>>>>
>>>> Hi
>>>>
>>>>
>>>>
>>>> Using Cassandra 2.0.0.
>>>>
>>>> 3 node cluster
>>>>
>>>> Replication factor 2.
>>>>
>>>> Using consistency ALL for both read and writes.
>>>>
>>>>
>>>>
>>>> I have a single thread that reads a value, updates it and writes it
>>>> back to the table. The column type is bigint. I am updating counts for
>>>> a timestamp.
>>>>
>>>>
>>>>
>>>> With a single thread and consistency ALL, I expect no lost updates. But
>>>> as seen from my application log below,
>>>>
>>>>
>>>>
>>>> 10 07:01:58,507 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=59614 val =252 new =59866
>>>>
>>>> 10 07:01:58,611 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=59866 val =252 new =60118
>>>>
>>>> 10 07:01:59,136 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=60118 val =255 new =60373
>>>>
>>>> 10 07:02:00,242 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=60373 val =243 new =60616
>>>>
>>>> 10 07:02:00,244 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=60616 val =19 new =60635
>>>>
>>>> 10 07:02:00,326 [Thread-10] BeaconCountersCAS2DAO [INFO] 1389366000  H
>>>>  old=60616 val =233 new =60849
>>>>
>>>>
>>>>
>>>> See the last 2 lines of the above log.
>>>>
>>>> Value 60616 is updated to 60635, but the next operation reads the old
>>>> value 60616 again.
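One way to picture the anomaly in that log (a simplified last-write-wins model, not real Cassandra internals): if the second write is stamped with an earlier timestamp than the first, say because of 2 ms of skew between coordinator clocks, it is silently dropped and the next read returns the old value.

```python
# Simplified last-write-wins store: the cell with the higher timestamp wins.
store = {}  # key -> (timestamp, value)

def write(key, value, ts):
    if key not in store or ts > store[key][0]:
        store[key] = (ts, value)

def read(key):
    return store[key][1]

key = "1389366000-H"
write(key, 60616, ts=244)  # first write, coordinator clock running ahead
write(key, 60635, ts=242)  # second write stamped 2 ms earlier: dropped
print(read(key))  # 60616 -- the "old" value, matching the log above
```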
>>>>
>>>>
>>>>
>>>> I am not using the counter column type because it does not support TTL,
>>>> and I hear there are a lot of open issues with counters.
>>>>
>>>>
>>>>
>>>> Is there anything else I can do to further tighten the consistency, or
>>>> is this pattern of high-volume read-update-write not going to work in
>>>> C*?
>>>>
>>>>
>>>>
>>>> regards
>>>>
>>>> MJ
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>
>>>
>>>
>>> --
>>> Steve Robenalt
>>> Software Architect
>>>  HighWire | Stanford University
>>> 425 Broadway St, Redwood City, CA 94063
>>>
>>> srobe...@stanford.edu
>>> http://highwire.stanford.edu
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> http://khangaonkar.blogspot.com/
>
