Alexis, I did some more research and I will go with ZooKeeper for locking and Cassandra for storage. This way I can use the locks in ZooKeeper to delimit transactions in Cassandra. Once the distributed locking problem is solved I don't really need a distributed queue, but it comes for free with ZooKeeper too.
Regards,
Dan

2013/2/1 Alexis Wilke <ale...@m2osw.com>:
> Hi Daniel,
>
> Note that I did implement the Lamport bakery algorithm using Cassandra only
> in libQtCassandra (C++), which I'm working on. That's a lock which uses
> QUORUM. So as mentioned earlier, you may get in trouble with reads if you
> have problems with some nodes (but from what I understand, if any one write
> uses QUORUM you run into the same problem anyway.)
>
> Frankly, in my website environment, not having what I would call a *standard
> lock* is a really annoying problem! I currently need it in only two
> circumstances: a user registers an account (which must be referenced by a
> unique email address), and a user creates a "new page" (each page must be
> unique on a per-website basis; also, in our environment a "page" means any
> type of content, but in most cases each one of those will be unique anyway.)
>
> A queue is not so good for us because 99.9% of the time we want a quick
> response, and if two users try to register with different email addresses we
> get an "instant lock." I will not need to wait on my locks because I'll be
> locking a unique piece of content. The penalty is about 3 reads, 3 writes
> and 2 drops total (lock + unlock) if you are the only one acquiring the
> lock. Additional reads and sleeps are required if others have acquired the
> lock. I tested with 24 processes total on 3 client computers and got the
> expected result every second for 300 seconds (5 min.) running my test
> against a Cassandra cluster made of 3 nodes (24 x 300 = 7,200 successful
> locks which clashed every second.)
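The bakery-style lock Alexis describes can be sketched in miniature. Below, plain Python lists stand in for the rows that libQtCassandra reads and writes at QUORUM; this is an assumption for illustration only, since the real lock also has to deal with TTLs, node failures and network delays:

```python
import threading
import time

N = 3                      # number of competing clients
entering = [False] * N     # "I am picking a ticket"
number = [0] * N           # ticket numbers; 0 means "not waiting"
counter = {"value": 0}     # shared state the lock protects

def lock(i):
    entering[i] = True
    number[i] = 1 + max(number)        # take a ticket above all seen
    entering[i] = False
    for j in range(N):
        if j == i:
            continue
        while entering[j]:             # wait until j has its ticket
            time.sleep(0)
        # wait while j holds a smaller ticket (ties broken by index)
        while number[j] != 0 and (number[j], j) < (number[i], i):
            time.sleep(0)

def unlock(i):
    number[i] = 0

def worker(i):
    for _ in range(50):
        lock(i)
        counter["value"] += 1          # critical section
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])    # 150: every increment survived
```

The "3 reads, 3 writes" cost Alexis quotes corresponds to the ticket-taking and scanning steps above, each of which becomes a round trip when the arrays live in a Cassandra column family instead of memory.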
> You may be interested in checking these out for more details:
>
> http://snapwebsites.org/journal/2013/01/inter-process-inter-computer-lock-cassandra
> http://snapwebsites.org/project/libqtcassandra
> http://snapwebsites.org/mo_references_view/libQtCassandra-doc-0.4/classQtCassandra_1_1QCassandraLock.html#details
>
> ------------
> Alexis Wilke
> CEO
> Made to Order Software Corporation
> http://snapwebsites.org/
>
> Daniel Godás wrote:
>>
>> This sounds reasonable. Thanks for the guidance, I think I just cut
>> down this project's development time by half...
>>
>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>:
>>>
>>> QUORUM means that (RF/2 + 1) nodes must respond. See this calculator:
>>>
>>> http://www.ecyrd.com/cassandracalculator/
>>>
>>> If you use RF=2 and you have 4 nodes, you cannot survive the outage of
>>> any node if you use QUORUM. So you are introducing a point of failure.
>>>
>>> If you need to do reads before writes I strongly suggest a caching
>>> mechanism, and if you want to do locking I suggest queues. I wouldn't
>>> send CQL statements on queues, though that works; I would send Java
>>> business objects, but that's my personal preference.
>>>
>>> If you are frequently updating and reading the same data, that's an
>>> anti-pattern in Cassandra. Reads will be competing for resources with
>>> compactions, etc. And of course, unless you use QUORUM or ALL you really
>>> shouldn't rely on a read-before-write.
>>>
>>> Regards,
>>> Oleg Dulin
>>> Please note my new office #: 732-917-0159
>>>
>>> On Jan 31, 2013, at 9:35 AM, Daniel Godás <dgo...@gmail.com> wrote:
>>>
>>>> Can I not get away with it using QUORUM? Correct me if I'm wrong, but I
>>>> think the only way I can get a read consistency issue using QUORUM is
>>>> if 2 nodes fail after a write and before the TTL expires. I can live
>>>> with the overhead of using QUORUM for the locking operations as they
>>>> won't be used often.
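Oleg's point about RF=2 is easy to verify with the quorum formula; the small arithmetic sketch below is nothing Cassandra-specific, just the floor(RF/2) + 1 rule the calculator uses:

```python
# QUORUM needs floor(RF/2) + 1 replicas to respond, so RF - quorum(RF)
# replicas may be down before QUORUM operations start failing.
def quorum(rf: int) -> int:
    return rf // 2 + 1

def tolerated_outages(rf: int) -> int:
    return rf - quorum(rf)

for rf in (2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated outages={tolerated_outages(rf)}")
# RF=2: quorum=2, tolerated outages=0   <- RF=2 cannot survive losing a replica
# RF=3: quorum=2, tolerated outages=1
# RF=5: quorum=3, tolerated outages=2
```

This is why RF=3 is the usual minimum when QUORUM reads and writes are expected to keep working through a single node outage.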
>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>:
>>>>>
>>>>> The problem with using Cassandra for locking is that if you ever have
>>>>> read consistency issues (which you will, unless you use ALL) you will
>>>>> have inconsistent values.
>>>>>
>>>>> In general I would avoid doing a read-before-write with Cassandra. I
>>>>> would come up with another way to update my data.
>>>>>
>>>>> Regards,
>>>>> Oleg Dulin
>>>>>
>>>>> On Jan 31, 2013, at 9:19 AM, Daniel Godás <dgo...@gmail.com> wrote:
>>>>>
>>>>>> Ok, I've done some reading. If I understood it correctly, the idea
>>>>>> would be to send messages to the queue that contain a transaction, i.e.
>>>>>> a list of CQL commands to be run atomically. When one of the consumers
>>>>>> gets the message it can run the transaction atomically before allowing
>>>>>> another consumer to get the next message. If this is correct, then in
>>>>>> order to handle cases in which I need to interleave code with the CQL
>>>>>> statements, e.g. to check retrieved values, I need to implement a
>>>>>> protocol that uses the message queue as a locking mechanism. How is
>>>>>> this better than using Cassandra for locking (using the algorithm I
>>>>>> proposed or another one)?
>>>>>>
>>>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>:
>>>>>>>
>>>>>>> This may help:
>>>>>>>
>>>>>>> http://activemq.apache.org/how-do-distributed-queues-work.html
>>>>>>> http://activemq.apache.org/topologies.html
>>>>>>> http://activemq.apache.org/how-do-i-embed-a-broker-inside-a-connection.html
>>>>>>>
>>>>>>> Although I would use ActiveMQ Spring configuration, not write code.
>>>>>>> But the point is -- you can have multiple processes participating in
>>>>>>> an ActiveMQ federation; you can configure AMQ's fault tolerance
>>>>>>> profiles to your liking without having to set up yet another server
>>>>>>> with a single point of failure.
>>>>>>> You have a single distributed queue. Each process has a writer
>>>>>>> consumer on that queue. AMQ knows to load-balance; only one consumer
>>>>>>> at a time gets to write. Instead of writing to Cassandra, you send
>>>>>>> your data item to the queue. The next available consumer gets the
>>>>>>> message and writes it -- all in the order of messages on the queue,
>>>>>>> and only one consumer writer at a time.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Oleg Dulin
>>>>>>>
>>>>>>> On Jan 31, 2013, at 8:11 AM, Daniel Godás <dgo...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Sounds good, I'll try it out. Thanks for the help.
>>>>>>>>
>>>>>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>:
>>>>>>>>>
>>>>>>>>> Use embedded AMQ brokers, no need to set up any servers. It
>>>>>>>>> literally is one line of code to turn it on, and 5 lines of code to
>>>>>>>>> implement what you want.
>>>>>>>>>
>>>>>>>>> We have a cluster of servers writing to Cassandra this way and we
>>>>>>>>> are not using any J2EE containers.
>>>>>>>>>
>>>>>>>>> On Thursday, January 31, 2013, Daniel Godás wrote:
>>>>>>>>>
>>>>>>>>>> Doesn't that require you to set up a server for the message queue
>>>>>>>>>> and know its address? That sort of defeats the purpose of having a
>>>>>>>>>> database like Cassandra in which all nodes are equal and there's no
>>>>>>>>>> single point of failure.
>>>>>>>>>>
>>>>>>>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>:
>>>>>>>>>>>
>>>>>>>>>>> Use a JMS message queue to send objects you want to write. Your
>>>>>>>>>>> writer process then will listen on this queue and write to
>>>>>>>>>>> Cassandra. This ensures that all writes happen in an orderly
>>>>>>>>>>> fashion, one batch at a time.
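The pattern Oleg describes -- producers only enqueue, and a single consumer applies writes in arrival order -- can be sketched in-process. Here `queue.Queue` stands in for the AMQ distributed queue and a dict stands in for Cassandra, both assumptions for illustration only:

```python
import queue
import threading

writes = queue.Queue()   # stand-in for the distributed queue
store = {}               # stand-in for Cassandra
applied = []             # order in which the writer applied items
STOP = object()          # sentinel to shut the writer down

def writer():
    # The only code path that ever touches the store: one consumer,
    # applying items strictly in queue order.
    while True:
        item = writes.get()
        if item is STOP:
            break
        key, value = item
        store[key] = value
        applied.append(key)

t = threading.Thread(target=writer)
t.start()

# Producers never write directly; they just enqueue.
for i in range(10):
    writes.put((f"row-{i}", i))
writes.put(STOP)
t.join()
print(len(store), applied[0], applied[-1])   # 10 row-0 row-9
```

The hypothetical `row-*` keys are just placeholders; the point is that serializing all writes through one consumer removes the read-before-write races the thread is worried about, at the cost of making that queue the ordering bottleneck.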
>>>>>>>>>>> I suggest ActiveMQ. It is easy to set up; this is what we use for
>>>>>>>>>>> this type of use case. No need to overcomplicate this with
>>>>>>>>>>> Cassandra.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Oleg Dulin
>>>>>>>>>>>
>>>>>>>>>>> On Jan 31, 2013, at 6:35 AM, Daniel Godás <dgo...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I need a locking mechanism on top of Cassandra so that multiple
>>>>>>>>>>>> clients can protect a critical section. I've seen some attempts,
>>>>>>>>>>>> including Dominic Williams' wait chain algorithm, but I think it
>>>>>>>>>>>> can be simplified. This is the procedure I wrote to implement a
>>>>>>>>>>>> simple mutex. Note that it hasn't been thoroughly tested and I
>>>>>>>>>>>> have been using Cassandra for a very short time, so I'd
>>>>>>>>>>>> appreciate any comments on obvious errors or things I'm doing
>>>>>>>>>>>> plain wrong that will never work.
>>>>>>>>>>>>
>>>>>>>>>>>> The assumptions and requirements for the algorithm are the same
>>>>>>>>>>>> as Dominic Williams'
>>>>>>>>>>>> (http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf).
>>>>>>>>>>>>
>>>>>>>>>>>> We will create a column family for the locks, referred to as
>>>>>>>>>>>> "locks" throughout this procedure. The column family contains two
>>>>>>>>>>>> columns: an identifier for the lock, which will also be the
>>>>>>>>>>>> column key ("id"), and a counter ("c"). Throughout the procedure
>>>>>>>>>>>> "my_lock_id" will be used as the lock identifier. An arbitrary
>>>>>>>>>>>> time-to-live value is required by the algorithm.
>>>>>>>>>>>> This value will be referred to as "t". Choosing an appropriate
>>>>>>>>>>>> value for "t" will be postponed until the algorithm is deemed
>>>>>>>>>>>> good.
>>>>>>>>>>>>
>>>>>>>>>>>> === begin procedure ===
>>>>>>>>>>>>
>>>>>>>>>>>> (A) When a client needs to access the critical section the
>>>>>>>>>>>> following steps are taken:
>>>>>>>>>>>>
>>>>>>>>>>>> --- begin ---
>>>>>>>>>>>>
>>>>>>>>>>>> 1) SELECT c FROM locks WHERE id = my_lock_id
>>>>>>>>>>>> 2) if c = 0 try to acquire the lock (B), else don't try (C)
>>>>>>>>>>>>
>>>>>>>>>>>> --- end ---
>>>>>>>>>>>>
>>>>>>>>>>>> (B) Try to acquire the lock:
>>>>>>>>>>>>
>>>>>>>>>>>> --- begin ---
>>>>>>>>>>>>
>>>>>>>>>>>> 1) UPDATE locks USING TTL t SET c = c + 1 WHERE id = my_lock_id
>>>>>>>>>>>> 2) SELECT c FROM locks WHERE id = my_lock_id
>>>>>>>>>>>> 3) if c = 1 we acquired the lock (D), else we didn't (C)
>>>>>>>>>>>>
>>>>>>>>>>>> --- end ---
>>>>>>>>>>>>
>>>>>>>>>>>> (C) Wait before re-trying:
>>>>>>>>>>>>
>>>>>>>>>>>> --- begin ---
>>>>>>>>>>>>
>>>>>>>>>>>> 1) sleep for a random time higher than t and start at (A) again
>>>>>>>>>>>>
>>>>>>>>>>>> --- end ---
>>>>>>>>>>>>
>>>>>>>>>>>> (D) Execute the critical section and release the lock:
>>>>>>>>>>>>
>>>>>>>>>>>> --- begin ---
>>>>>>>>>>>>
>>>>>>>>>>>> 1) start a background thread that increments c with TTL = t at
>>>>>>>>>>>> every t / 2 interval (UPDATE locks USING TTL t SET c = c + 1
>>>>>>>>>>>> WHERE id = my_lock_id)
>>>>>>>>>>>> 2) execute the critical section
>>>>>>>>>>>> 3) kill the background thread
>>>>>>>>>>>> 4) DELETE FROM locks WHERE id = my_lock_id
>>>>>>>>>>>>
>>>>>>>>>>>> --- end ---
>>>>>>>>>>>>
>>>>>>>>>>>> === end procedure ===
>>>>>>>>>>>>
>>>>>>>>>>>> Looking forward to reading your comments,
>>>>>>>>>>>> Dan

>>>>>>>>> --
>>>>>>>>> Sent from Gmail Mobile
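Dan's steps (A)-(D) can be simulated in-process to check the logic. In the sketch below, a dict guarded by a `threading.Lock` stands in for the "locks" column family and for the atomicity of a single-row counter update -- assumptions for illustration only. Note also that Cassandra counter columns do not accept a TTL, so a real implementation would need another expiry mechanism; here expiry is emulated with timestamps, and the step (D) keep-alive thread is omitted because this critical section runs far shorter than t:

```python
import random
import threading
import time

T = 0.1                          # the arbitrary TTL "t"
locks = {}                       # lock id -> (counter value, expiry time)
table = threading.Lock()         # emulates single-row update atomicity

def read_c(lock_id):             # (A) 1: SELECT c
    with table:
        entry = locks.get(lock_id)
        if entry is None or entry[1] < time.monotonic():
            return 0             # expired counts as absent
        return entry[0]

def increment(lock_id):          # (B) 1+2: UPDATE ... USING TTL t, then SELECT
    with table:
        entry = locks.get(lock_id)
        c = entry[0] if entry and entry[1] >= time.monotonic() else 0
        locks[lock_id] = (c + 1, time.monotonic() + T)
        return c + 1

def delete(lock_id):             # (D) 4: DELETE FROM locks
    with table:
        locks.pop(lock_id, None)

shared = {"n": 0}

def client(rounds):
    for _ in range(rounds):
        while True:
            if read_c("my_lock_id") == 0 and increment("my_lock_id") == 1:
                break                                # lock acquired -> (D)
            time.sleep(random.uniform(T, 2 * T))     # (C): back off > t
        shared["n"] += 1                             # critical section
        delete("my_lock_id")

threads = [threading.Thread(target=client, args=(5,)) for _ in range(3)]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(shared["n"])               # 15: every round ran under exclusion
```

One behavior worth noticing: when two clients both pass step (A) and increment, both see c > 1 and back off, so the lock is only freed when the TTL expires -- the random sleep in (C) is what eventually breaks the tie.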