Yes, each CF has its own memtable. The writes are still atomic in the sense that a write to multiple CFs is all-or-nothing (the CommitLog still logs the whole row). Having a separate memtable per CF simply means that concurrent operations may not be *isolated* from each other. So, if I have two operations:

Op1: Write(key1, CF1:col1=new, CF2:col2=new)
Op2: Read(key1, CF1:col1, CF2:col2)

Assuming both columns had "old" as the previous value, then depending on the execution schedule Op2 could return one of:

old, old <-- Op2 before Op1
old, new <-- Op1 writes CF2, then Op2 gets scheduled
new, old <-- Op1 writes CF1, then Op2 gets scheduled
new, new <-- Op1 before Op2

But with time (eventually), re-executing Op2 will always return the last result (new, new).

I agree that this is of limited value right now, but atomicity without isolation can still be useful. It'll save the app some cleanup and book-keeping code.

On Wed, Apr 22, 2009 at 9:36 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Wed, Apr 22, 2009 at 11:32 AM, Sandeep Tata <sandeep.t...@gmail.com> wrote:
>> Having multiple CFs in a row could be useful for writes. Consider the
>> case when you use one CF to store the data and another to store some
>> kind of secondary index on that data. It will be useful to apply
>> updates to both families atomically.
>
> Except that's not how it works, each Memtable (CF) has its own
> executor thread so even if you put multiple CFs in a Row it's not
> going to be atomic with the current system, and it's a big enough
> change to try to add some kind of coordination there that I don't
> think it's worth it. (And you have seen that I am not scared of big
> changes, so that should give you pause. :)
>
> Back to YAGNI. :) Row doesn't fit in the current execution model, so
> rather than leaving it as a half-baked creation, better to excise it
> and if we ever decide to support atomic updates across CFs then that
> would be the time to add it or something like it back.
>
> -Jonathan
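
P.S. To make the four outcomes listed above concrete, here is a rough Python sketch that enumerates the schedules. It is a toy model, not Cassandra code: the function name `possible_reads` and the dict-based "memtables" are made up for illustration, and it assumes Op2 sees a single point-in-time view taken before, between, or after Op1's two per-CF writes.

```python
from itertools import permutations


def possible_reads():
    """Return every (CF1:col1, CF2:col2) pair Op2 could observe.

    Hypothetical model: Op1 is two independent per-CF writes, and Op2 is a
    single read scheduled after 0, 1, or 2 of those writes have been applied.
    """
    writes = [("CF1", "col1"), ("CF2", "col2")]  # Op1's two memtable updates
    outcomes = set()
    # Each CF has its own executor, so the two writes may land in either order.
    for order in permutations(writes):
        # Op2 runs after `read_at` of Op1's writes have been applied.
        for read_at in range(len(order) + 1):
            memtables = {"CF1": {"col1": "old"}, "CF2": {"col2": "old"}}
            for cf, col in order[:read_at]:
                memtables[cf][col] = "new"
            outcomes.add((memtables["CF1"]["col1"], memtables["CF2"]["col2"]))
    return outcomes


if __name__ == "__main__":
    for result in sorted(possible_reads()):
        print(result)
```

The mixed results (old, new) and (new, old) only show up when the read lands between the two writes, which is the missing isolation. Once both writes have been applied, every schedule yields (new, new), which is the all-or-nothing part.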