Re: Atomic Compare and Swap

aaron morton Wed, 23 Jun 2010 04:56:48 -0700

I've been playing with something like CAS; it's not the same, but it may be of interest.

I write some data into Cassandra with quorum or better consistency, which allows me to assert what it should look like when read back. If the assertion holds I can then go ahead.

For example, in a CF with TimeUUID ordering, the client writes a column against the key of the thing we want to update. This write does not store the value. Then read back the first ordered column; if its name is my uuid then I can proceed. Otherwise delete the column. If you know the uuid of the last update you can read back two columns, then assert that yours is the first and the previous one is the second.
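A minimal sketch of the write-then-read-back check described above, using an in-memory dict in place of a column family. `Row` and `try_claim` are hypothetical names, not Cassandra or client-library calls, and the sketch assumes "first ordered column" means the newest TimeUUID (a reversed slice). In a real cluster the insert and the read would both be quorum operations.

```python
import uuid


class Row:
    """In-memory stand-in for one row of a TimeUUID-ordered column family."""

    def __init__(self):
        self.columns = {}  # column name (a version 1 UUID) -> value

    def insert(self, name, value=b""):
        self.columns[name] = value

    def remove(self, name):
        self.columns.pop(name, None)

    def newest(self, count):
        # Assumed read order: newest column names first, by UUID timestamp.
        return sorted(self.columns, key=lambda u: u.time, reverse=True)[:count]


def try_claim(row, last_seen=None):
    """Write a marker column, then read back to check whether we won."""
    mine = uuid.uuid1()
    row.insert(mine)  # marker only, no value is stored
    head = row.newest(2)
    won = head[0] == mine and (last_seen is None or head[1:] == [last_seen])
    if not won:
        row.remove(mine)  # back out so that other clients can proceed
        return None
    return mine
```

A caller that gets a uuid back may go ahead with its update; a caller that gets `None` lost the race (or held a stale `last_seen`) and should re-read before retrying.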

Perhaps if you were doing a CAS you could then write the actual value you want to update and somehow store the uuid from above with it. Say, as a column in another column family with the uuid as the name and the value as the value. To read, get the first column from both CFs as a multi get; the column names from both CFs must match for the value to be correct.


(could just use two diff keys in same CF)

Hope that makes sense.
Aaron






On 23/06/2010, at 4:27 PM, Mike Malone <[email protected]> wrote:

I'd be interested in what the folks who want CAS implementations think about vector clocks. Can you use them to fulfill your use cases? If not, why not?
I ask because I have found myself wanting CAS in Cassandra too, but I think that's only because I'm pretty familiar with HTTP. I think vector clocks with client merge give you essentially the same functionality, but in a way that fits much more nicely with the rest of the Cassandra architecture. CAS really exacerbates Cassandra's weaknesses.
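For readers unfamiliar with the idea, here is a rough sketch of how vector clocks let a client detect concurrent updates and merge them. The functions `compare`, `merge` and `increment` are illustrative names only; nothing here is a Cassandra API, and clocks are modelled as plain dicts of node name to counter.

```python
def compare(a, b):
    """Return 'before', 'after', 'equal' or 'concurrent' for two vector clocks."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: the client has to merge


def merge(a, b):
    """Pointwise maximum: the clock of a value that supersedes both inputs."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new
```

Two updates derived from the same base clock through different nodes compare as "concurrent", which is the case where a CAS would have rejected one writer and a vector-clock scheme instead hands both values to the client to reconcile.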

Mike
On Tue, Jun 22, 2010 at 4:52 PM, Rishi Bhardwaj <[email protected]> wrote:
S>: An *atomic* CAS is another beast and I see at least two difficulties:
S>: 1) making it atomic locally: Cassandra's implementation is very much multi-threaded. On a given node, while you're reading-comparing-and-swapping on some column c, no other thread should be allowed to write c (even 'normal' write). You would probably need to have specific column families where CAS is allowed and for which all writes would be slower (since some locking would be involved). Even then, making such locking efficient and right is not easy. But in the end, local atomicity is quite probably the easy part.
R: I am curious as to how Cassandra handles two concurrent writes to the same column right now. Is there any locking on the write path to serialize two writes to the same column? If there is any locking then CAS can build on that. If there is no such locking then we could exclude normal writes from the synchronization/locking required for CAS. So the normal write path remains the same, and we let the client know that atomic CAS wouldn't work if normal writes are also happening on the same column values. In short, a client should not mix normal writes with atomic CAS for writing some column value. This will hopefully make things simpler.
S>: 2) making it atomic cluster-wide: data is replicated and an atomic CAS would need to apply on the exact same column version in every node. Which, with eventual consistency especially, is pretty hard to accomplish unless you're locking the cluster (but that's what Cages/ZK do).
R: For starters it would be great if atomic CAS could work for consistency levels Quorum and ALL and not be supported for other consistency levels. Even for other consistency levels, what would stop CAS from working? Why would one require cluster-wide locking? I might be mistaken here, but the atomic CAS operation would happen individually at all the replica nodes (either directly or through hinted writes) and would succeed or fail depending on the timestamp/version of the column at the replica. If we do Quorum reads and CAS writes then we can also be sure about consistency.
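To make the per-replica idea concrete, here is a toy simulation, offered only as a sketch under stated assumptions: `Replica` and `run` are hypothetical names, each replica's `cas` is assumed to be locally atomic, and message delivery is modelled as a fixed arrival order per replica. It is not how Cassandra behaves today.

```python
class Replica:
    """One replica's copy of a single column, with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None

    def cas(self, expected_version, new_value):
        # Apply only if the local copy is still at the version the client read.
        if self.version != expected_version:
            return False
        self.version = expected_version + 1
        self.value = new_value
        return True


def run(arrival_orders, expected_version, writes):
    """Deliver each client's CAS to every replica in that replica's order."""
    replicas = [Replica() for _ in arrival_orders]
    acks = {client: 0 for client in writes}
    for replica, order in zip(replicas, arrival_orders):
        for client in order:
            if replica.cas(expected_version, writes[client]):
                acks[client] += 1
    quorum = len(replicas) // 2 + 1
    winners = [c for c, n in acks.items() if n >= quorum]
    return winners, [r.value for r in replicas]


# Replica 0 sees A first; replicas 1 and 2 see B first.
winners, values = run([["A", "B"], ["B", "A"], ["B", "A"]], 0,
                      {"A": "a", "B": "b"})
```

In this model each replica acknowledges at most one writer per version, so at most one client can collect a quorum (here B). The replicas can still end up holding different values at the same version (here replica 0 keeps A's value), and with three competing writers nobody may reach quorum at all, so some repair or retry rule would still be needed.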

S>: That being said, if you have a neat solution for efficient and distributed atomic CAS that doesn't require rewriting 80% of Cassandra, I'm sure there will be interest in that.
R: That sounds great. I am definitely going to look into this and report back if I have a good solution.


Thanks,
Rishi




________________________________
From: Sylvain Lebresne <[email protected]>
To: [email protected]
Sent: Tue, June 22, 2010 1:21:51 AM
Subject: Re: Atomic Compare and Swap
On Mon, Jun 21, 2010 at 11:19 PM, Rishi Bhardwaj <[email protected]> wrote:
I have read the post on Cages and it is definitely very interesting. But Cages seems to be too coarse-grained compared to an atomic compare and swap on a Cassandra column value. Cages would make sense when one wants to do multiple atomic row, column updates. Also, I am not so sure about the scalability when it comes to using ZooKeeper for keeping locks on Cassandra columns... there would also be a performance hit with an added RPC for every write. I feel Cages may be fine for systems where one has few locks, but I feel an atomic CAS in Cassandra would help us avoid distributed locking systems and ZooKeeper in many other simpler scenarios. For more complicated (transaction-like) things, using Cages may be fine. Then again, doing a read before write for CAS in Cassandra will make CAS at least as slow as a read, which I believe will still be better than taking a single column lock from ZooKeeper.
What do other folks think in this regard? From whatever I have read, I believe CAS is feasible in Cassandra without hurting the normal write path performance. Only for CAS writes would we have to pay the read-before-write penalty. I am going to do a feasibility study for this and would love any pointers from others about this.

Making a (non-atomic) CAS is easy (doing it client side is fine, and there has been some discussion about 'callbacks' that may or may not someday allow doing that server-side).

An *atomic* CAS is another beast and I see at least two difficulties:

1) making it atomic locally: Cassandra's implementation is very much multi-threaded. On a given node, while you're reading-comparing-and-swapping on some column c, no other thread should be allowed to write c (even 'normal' write). You would probably need to have specific column families where CAS is allowed and for which all writes would be slower (since some locking would be involved). Even then, making such locking efficient and right is not easy. But in the end, local atomicity is quite probably the easy part.
2) making it atomic cluster-wide: data is replicated and an atomic CAS would need to apply on the exact same column version in every node. Which, with eventual consistency especially, is pretty hard to accomplish unless you're locking the cluster (but that's what Cages/ZK do).

That being said, if you have a neat solution for efficient and distributed atomic CAS that doesn't require rewriting 80% of Cassandra, I'm sure there will be interest in that.
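As an illustration of point 1 only (local atomicity on a single node), a minimal sketch with one lock per column might look like the following. `ColumnStore` is a hypothetical in-memory class, not Cassandra code, and it says nothing about point 2, the cluster-wide case.

```python
import threading


class ColumnStore:
    """Toy single-node store where CAS and normal writes share a lock."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        # Create the per-column lock on first use, under a guard lock.
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def write(self, key, value):
        # Normal writes take the same lock, which is where they get slower.
        with self._lock_for(key):
            self._data[key] = value

    def compare_and_swap(self, key, expected, new):
        # Read, compare and write happen while no other thread can write key.
        with self._lock_for(key):
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True
```

If several threads race to swap the same column from the same expected value, exactly one `compare_and_swap` call returns `True`. The cost is that every `write` to a CAS-enabled column now contends on that lock too.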

--
Sylvain
Thanks,
Rishi



________________________________
From: Rauan Maemirov <[email protected]>
To: [email protected]
Sent: Mon, June 21, 2010 11:27:02 AM
Subject: Re: Atomic Compare and Swap

Have you read this post?
http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
I guess you will like it.

2010/6/22 Rishi Bhardwaj <[email protected]>
I am definitely interested in taking this work up. I believe the CAS functionality would help in a lot of different scenarios and could help avoid the use of other external services (like ZooKeeper) to provide similar functionality. I am new to Cassandra development and would really appreciate pointers from the dev community about how to approach/start on this project. Also, how feasible is the approach mentioned below for implementing the CAS functionality? It would be great if we could have a discussion on the pros and cons.

Thanks,
Rishi



________________________________
From: Sriram Srinivasan <[email protected]>
To: [email protected]
Sent: Sun, June 20, 2010 9:47:37 PM
Subject: Re: Atomic Compare and Swap


I too am interested in a CAS facility.
I like Rishi's proposal. One could simply use a version number as the logical timestamp. If we promote CAS to a consistency level, it would rate higher than a quorum. One pays the price for a more complex write path to obtain the requisite guarantee.


On Jun 21, 2010, at 4:03 AM, Rishi Bhardwaj wrote:
Here's another thought I had: if, say, the user always wrote with quorum (or to all) nodes, then can't we implement CAS (compare and swap), assuming that the user employs a logical timestamp and Cassandra doesn't allow writes to a column with the same or an older timestamp? Here's the scenario I am thinking about:
Say we use a logical timestamp for a column value and let's assume the current timestamp is t. Now say two clients read this column and generate concurrent CAS (compare and swap) operations on timestamp t, and for both the writes the resulting new timestamp would become (t+1). Now if we don't allow writes to a column with the same timestamp, then only one of these writes would succeed. Of course another assumption is that if a third CAS write with compare on logical timestamp (t - 1) came in, it would be denied, as I believe Cassandra doesn't allow "older" writes to win over "newer" writes.
Do you think such a thing can be accomplished?
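The scenario above can be sketched for a single replica as follows. `Column` and `cas` are hypothetical names, and the rule that a write with the same or an older timestamp is rejected is the assumption made in the proposal, not a statement about how Cassandra resolves conflicts.

```python
class Column:
    """One column on one replica, carrying a logical timestamp."""

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp

    def write(self, value, timestamp):
        # Proposed rule: reject a write whose timestamp is the same or older.
        if timestamp <= self.timestamp:
            return False
        self.value, self.timestamp = value, timestamp
        return True


def cas(column, read_timestamp, new_value):
    """Compare-and-swap expressed as a write at read_timestamp + 1."""
    return column.write(new_value, read_timestamp + 1)


col = Column("v0", 5)
first = cas(col, 5, "a")   # first client read t = 5, writes at t + 1 = 6
second = cas(col, 5, "b")  # second client also read t = 5, also writes at 6
third = cas(col, 4, "c")   # stale reader at t - 1 = 4, writes at 5
```

Under the assumed rule only the first write is accepted; the second collides on timestamp 6 and the third arrives with an older timestamp. This covers one replica only and assumes `write` itself runs atomically.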
