Re: Query Consistency Issues...

Steve Robenalt Tue, 15 Dec 2015 13:35:51 -0800

I agree with Jon. It's almost a statistical certainty that such updates
will be processed out of order some of the time because the clock sync
between machines will never be perfect.
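
To make the failure mode concrete, here is a minimal plain-Java sketch of timestamp-based last-write-wins. It is not Cassandra code: the class name, the `write`/`read` methods, and the 2 ms skew are all made up for illustration, and tie-breaking on equal timestamps is deliberately not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of last-write-wins conflict resolution.
public class LastWriteWinsDemo {
    // A stored cell: the value plus the write timestamp that produced it.
    private static final class Cell {
        final String value;
        final long timestampMicros;

        Cell(String value, long timestampMicros) {
            this.value = value;
            this.timestampMicros = timestampMicros;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    // Keep a write only if its timestamp is strictly newer than the stored one.
    // (Simplification: equal timestamps keep the existing value.)
    public void write(String key, String value, long timestampMicros) {
        Cell current = cells.get(key);
        if (current == null || timestampMicros > current.timestampMicros) {
            cells.put(key, new Cell(value, timestampMicros));
        }
    }

    public String read(String key) {
        Cell cell = cells.get(key);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        LastWriteWinsDemo store = new LastWriteWinsDemo();
        long t = 1_000_000L;        // "true" time of the first update, in microseconds
        long skewMicros = 2_000L;   // assumed: machine B's clock runs 2 ms behind machine A's

        // First update, timestamped by machine A.
        store.write("k", "first", t);
        // Second update, issued 1 ms later but timestamped by machine B.
        store.write("k", "second", t + 1_000L - skewMicros);

        // The later update carries the lower timestamp, so it is silently dropped.
        System.out.println(store.read("k")); // prints "first"
    }
}
```

With the skew larger than the gap between the two updates, the second write loses even though it happened later in wall-clock terms.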


Depending on how your actual code that shows this problem is structured,
there are ways to reduce or eliminate such issues. If the successive
updates are always expected to occur together in a specific order, you can
wrap them in a BatchStatement, which forces them to use the same
coordinator node and thus preserves the ordering of the updates. If there
is a causal relationship driving the order of the updates, a lightweight
transaction (LWT) might be appropriate. Another strategy is to publish an event
to a topic after the first update and a subscriber can then trigger the
second.
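
As a rough illustration of the lightweight-transaction option, the sketch below models only the "apply this write if the current state matches" contract that a CQL `IF` condition expresses. It is a plain-Java, single-process stand-in built on `ConcurrentMap`, not the driver API and not how Cassandra implements LWTs; the class, the method names, and the order states are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of conditional (compare-and-set) updates.
public class ConditionalUpdateDemo {
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    // Analogous in spirit to INSERT ... IF NOT EXISTS: true if the row was created.
    public boolean insertIfNotExists(String key, String value) {
        return rows.putIfAbsent(key, value) == null;
    }

    // Analogous in spirit to UPDATE ... IF state = expected: true if applied.
    public boolean updateIf(String key, String expected, String newValue) {
        return rows.replace(key, expected, newValue);
    }

    public String read(String key) {
        return rows.get(key);
    }

    public static void main(String[] args) {
        ConditionalUpdateDemo table = new ConditionalUpdateDemo();
        table.insertIfNotExists("order-1", "CREATED");

        // The causally later update arrives first and is rejected,
        // because the row is not yet in the state it depends on.
        boolean early = table.updateIf("order-1", "PAID", "SHIPPED");
        boolean paid = table.updateIf("order-1", "CREATED", "PAID");
        boolean shipped = table.updateIf("order-1", "PAID", "SHIPPED");

        System.out.println(early + " " + paid + " " + shipped + " " + table.read("order-1"));
        // prints "false true true SHIPPED"
    }
}
```

The point of the condition is that correctness no longer depends on whose clock stamped the write: an update that arrives before its precondition holds is refused, and the caller can retry.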

There are other options, but I've used the three above to solve this problem
whenever I've encountered this situation and haven't found a case where I
needed another.

HTH,
Steve

On Tue, Dec 15, 2015 at 12:56 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> High-volume updates to a single key in a distributed system that relies on
> timestamps for conflict resolution are not a particularly great idea.  If
> you ever do this from multiple clients you'll find unexpected results at
> least some of the time.
>
> On Tue, Dec 15, 2015 at 12:41 PM Paulo Motta <pauloricard...@gmail.com>
> wrote:
>
>> > We are using 2.1.7.1
>>
>> Then you should be able to use the java driver timestamp generators.
>>
>> > So, we need to look for clock sync issues between nodes in our ring?
>> How close do they need to be?
>>
>> Millisecond precision, since that is the server precision for timestamps,
>> so NTP should probably do the job. If your application has sub-millisecond
>> updates in the same partitions, you'd probably need to use client-side
>> timestamps anyway, since they allow setting timestamps with sub-ms
>> precision.
>>
>> > Very cool!  If we have multiple nodes in our application, I suppose
>> *their* clocks will have to be sync'ed for this to work, right?
>>
>> Correct, you may also use NTP to synchronize clocks between clients.
>>
>>
>> 2015-12-15 12:19 GMT-08:00 James Carman <ja...@carmanconsulting.com>:
>>
>>>
>>>
>>> On Tue, Dec 15, 2015 at 2:57 PM Paulo Motta <pauloricard...@gmail.com>
>>> wrote:
>>>
>>>> What cassandra and driver versions are you running?
>>>>
>>>>
>>> We are using 2.1.7.1
>>>
>>>
>>>> It may be that the second update is getting the same timestamp as the
>>>> first, or even a lower timestamp if it's being processed by another server
>>>> with an unsynced clock, so that update may be getting lost.
>>>>
>>>>
>>> So, we need to look for clock sync issues between nodes in our ring?
>>> How close do they need to be?
>>>
>>>
>>>> If you have high frequency updates in the same partition from the same
>>>> client you should probably use client-side timestamps with a configured
>>>> timestamp generator on the driver, available in Cassandra 2.1 and Java
>>>> driver 2.1.2, and the default in Java driver 3.0.
>>>>
>>>>
>>> Very cool!  If we have multiple nodes in our application, I suppose
>>> *their* clocks will have to be sync'ed for this to work, right?
>>>
>>
>>


-- 
Steve Robenalt
Software Architect
sroben...@highwire.org <bza...@highwire.org>
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication
