QUORUM means that (RF/2+1) nodes must respond. See this calculator: http://www.ecyrd.com/cassandracalculator/
If you use RF=2 and you have 4 nodes you cannot survive outage of any node if you use QUORUM. So you are introducing a point of failure. If you need to do reads before writes I strongly suggest a caching mechanism, and if you want to do locking I suggest queues. I wouldn't send CQL statements on queues though, but that works. I would send Java business objects, but that's my personal preference. If you are frequently updating and reading the same data, that's an anti-pattern in Cassandra. Reads will be competing for resources with compactions, etc. And of course, unless you use QUORUM or ALL you really shouldn't rely on a read-before-write. Regards, Oleg Dulin Please note my new office #: 732-917-0159 On Jan 31, 2013, at 9:35 AM, Daniel Godás <dgo...@gmail.com> wrote: > Can I not get away with it using QUORUM? Correct me if I'm wrong but I > think the only way I can get a read consistency issue using QUORUM is > if 2 nodes fail after a write and before the TTL expires. I can live > with the overhead of using QUORUM for the locking operations as they > won't be used often. > > 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>: >> The problem with using Cassandra for locking is that if you ever have read >> consistency issues (which you will unless you use ALL) you will have >> inconsistent values. >> >> In general I would avoid doing a read-before-write with Cassandra. I would >> come up with another way to update my data. >> >> Regards, >> Oleg Dulin >> Please note my new office #: 732-917-0159 >> >> On Jan 31, 2013, at 9:19 AM, Daniel Godás <dgo...@gmail.com> wrote: >> >>> Ok, I've done some reading. If I understood it correctly the idea >>> would be to send messages to the queue that contain a transaction i.e. >>> a list of CQL commands to be run atomically. When one of the consumers >>> gets the message it can run the transaction atomically before allowing >>> another consumer to get the next message. If this is correct then in >>> order to handle cases in which I need to interleave code with the CQL >>> statements e.g. to check retrieved values, I need to implement a >>> protocol that uses the message queue as a locking mechanism. How is >>> this better than using cassandra for locking? (using the algorithm I >>> proposed or another one). >>> >>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>: >>>> This may help: >>>> >>>> http://activemq.apache.org/how-do-distributed-queues-work.html >>>> >>>> http://activemq.apache.org/topologies.html >>>> >>>> http://activemq.apache.org/how-do-i-embed-a-broker-inside-a-connection.html >>>> >>>> Although I would use ActiveMQ spring configuration, not write code. But >>>> the point is -- you can have multiple processes participating in an >>>> ActiveMQ federation; you can configure AMQ's fault tolerance profiles to >>>> your liking without having to set up a yet another server with a single >>>> point of failure. >>>> >>>> You have a single distributed queue. Each process has a writer consumer on >>>> that queue. AMQ knows to load balance, only one consumer at a time gets to >>>> write. Instead of writing to cassandra, you send your data item to the >>>> queue. The next available consumer gets the message and writes it -- all >>>> in the order of messages on the queue, and only one consumer writer at a >>>> time. >>>> >>>> Regards, >>>> Oleg Dulin >>>> Please note my new office #: 732-917-0159 >>>> >>>> On Jan 31, 2013, at 8:11 AM, Daniel Godás <dgo...@gmail.com> wrote: >>>> >>>>> Sounds good, I'll try it out. Thanks for the help. >>>>> >>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com>: >>>>>> Use embedded amq brokers , no need set up any servers . It literally is >>>>>> one line of code to turn it on, and 5 lines of code to implement what you >>>>>> want. >>>>>> >>>>>> We have a cluster of servers writing to Cassandra this way and we are not >>>>>> using any j2ee containers. >>>>>> >>>>>> On Thursday, January 31, 2013, Daniel Godás wrote: >>>>>> >>>>>>> Doesn't that require you to set up a server for the message queue and >>>>>>> know it's address? That sort of defeats the purpose of having a >>>>>>> database like cassandra in which all nodes are equal and there's no >>>>>>> single point of failure. >>>>>>> >>>>>>> 2013/1/31 Oleg Dulin <oleg.du...@liquidanalytics.com <javascript:;>>: >>>>>>>> Use a JMS message queue to send objects you want to write. Your writer >>>>>>> process then will listen on this queue and write to Cassandra. This >>>>>>> ensures >>>>>>> that all writes happen in an orderly fashion, one batch at a time. >>>>>>>> >>>>>>>> I suggest ActiveMQ. It is easy to set up. This is what we use for this >>>>>>> type of a use case. No need to overcomplicate this with Cassandra. >>>>>>>> >>>>>>>> >>>>>>>> Regards, >>>>>>>> Oleg Dulin >>>>>>>> Please note my new office #: 732-917-0159 >>>>>>>> >>>>>>>> On Jan 31, 2013, at 6:35 AM, Daniel Godás >>>>>>>> <dgo...@gmail.com<javascript:;>> >>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I need a locking mechanism on top of cassandra so that multiple >>>>>>>>> clients can protect a critical section. I've seen some attempts, >>>>>>>>> including Dominic Williams' wait chain algorithm but I think it can be >>>>>>>>> simplified. This is the procedure I wrote to implement a simple mutex. >>>>>>>>> Note that it hasn't been thoroughly tested and I have been using >>>>>>>>> cassandra for a very short time so I'd appreciate any comments on >>>>>>>>> obvious errors or things I'm doing plain wrong and will never work. >>>>>>>>> >>>>>>>>> The assumptions and requirements for the algorithm are the same as >>>>>>>>> Dominic Williams' >>>>>>>>> ( >>>>>>> http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf). >>>>>>>>> >>>>>>>>> We will create a column family for the locks referred to as "locks" >>>>>>>>> throughout this procedure. The column family contains two columns; an >>>>>>>>> identifier for the lock which will also be the column key ("id") and >>>>>>>>> a counter ("c"). Throughout the procedure "my_lock_id" will be used as >>>>>>>>> the lock identifier. An arbitrary time-to-live value is required by >>>>>>>>> the algorithm. This value will be referred to as "t". Choosing an >>>>>>>>> appropriate value for "t" will be postponed until the algorithm is >>>>>>>>> deemed good. >>>>>>>>> >>>>>>>>> === begin procedure === >>>>>>>>> >>>>>>>>> (A) When a client needs to access the critical section the following >>>>>>>>> steps are taken: >>>>>>>>> >>>>>>>>> --- begin --- >>>>>>>>> >>>>>>>>> 1) SELECT c FROM locks WHERE id = my_lock_id >>>>>>>>> 2) if c = 0 try to acquire the lock (B), else don't try (C) >>>>>>>>> >>>>>>>>> --- end --- >>>>>>>>> >>>>>>>>> (B) Try to acquire the lock: >>>>>>>>> >>>>>>>>> --- begin --- >>>>>>>>> >>>>>>>>> 1) UPDATE locks USING TTL t SET c = c + 1 WHERE id = my_lock_id >>>>>>>>> 2) SELECT c FROM locks WHERE id = my_lock_id >>>>>>>>> 3) if c = 1 we acquired the lock (D), else we didn't (C) >>>>>>>>> >>>>>>>>> --- end --- >>>>>>>>> >>>>>>>>> (C) Wait before re-trying: >>>>>>>>> >>>>>>>>> --- begin --- >>>>>>>>> >>>>>>>>> 1) sleep for a random time higher than t and start at (A) again >>>>>>>>> >>>>>>>>> --- end --- >>>>>>>>> >>>>>>>>> (D) Execute the critical section and release the lock: >>>>>>>>> >>>>>>>>> --- begin --- >>>>>>>>> >>>>>>>>> 1) start background thread that increments c with TTL = t every t / 2 >>>>>>>>> interval (UPDATE locks USING TTL t SET c = c + 1 WHERE id = >>>>>>>>> my_lock_id) >>>>>>>>> 2) execute the critical section >>>>>>>>> 3) kill background thread >>>>>>>>> 4) DELETE * FROM locks WHERE id = my_lock_id >>>>>>>>> >>>>>>>>> --- end --- >>>>>>>>> >>>>>>>>> === end procedure === >>>>>>>>> >>>>>>>>> Looking forward to read your comments, >>>>>>>>> Dan >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Sent from Gmail Mobile >>>> >>