Yes, each CF has its own memtable. The writes are atomic in the sense
that I can still do an all-or-nothing write to multiple CFs (the
CommitLog still logs the whole row). Having multiple CFs, each with
its own memtable, simply means that concurrent operations may not be
*isolated* from each other. So, if I have 2 operations:

Op1: Write(key1, CF1:col1=new, CF2:col2=new)
Op2: Read(key1, CF1:col1, CF2:col2)

Assuming both columns had "old" as the previous value, depending on
the execution schedule Op2 could return one of:

old, old  <-- Op2 before Op1
old, new  <-- Op1 writes CF2, then Op2 gets scheduled
new, old  <-- Op1 writes CF1, then Op2 gets scheduled
new, new  <-- Op1 before Op2

But eventually, once Op1 has been applied to both memtables,
re-executing Op2 will always return the last result (new, new).
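
To make the schedule concrete, here is a toy enumeration in Python. It
is only a sketch: the memtables are plain dicts, Op2 is assumed to read
both CFs at a single point between Op1's per-CF writes, and the name
possible_reads is made up for illustration. None of this is Cassandra
code.

```python
from itertools import permutations


def possible_reads():
    """Enumerate what Op2 can observe across orderings of Op1's per-CF writes."""
    outcomes = set()
    # Op1 touches one column in each CF; each CF's memtable applies its
    # part independently, so the two writes may land in either order.
    writes = [("CF1", "col1"), ("CF2", "col2")]
    for order in permutations(writes):
        # read_point = how many of Op1's per-CF writes have been applied
        # when Op2 gets scheduled (0 = none yet, 2 = both).
        for read_point in range(len(order) + 1):
            memtables = {"CF1": {"col1": "old"}, "CF2": {"col2": "old"}}
            for cf, col in order[:read_point]:
                memtables[cf][col] = "new"
            outcomes.add((memtables["CF1"]["col1"], memtables["CF2"]["col2"]))
    return outcomes


if __name__ == "__main__":
    for result in sorted(possible_reads()):
        print(result)
```

The set contains exactly the four combinations listed above. The two
mixed ones are the missing isolation; the fact that every schedule ends
at (new, new) once both writes are applied is the all-or-nothing part.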

I agree that this is of limited value right now, but atomicity without
isolation can still be useful. It'll save the app some cleanup and
book-keeping code.



On Wed, Apr 22, 2009 at 9:36 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Wed, Apr 22, 2009 at 11:32 AM, Sandeep Tata <sandeep.t...@gmail.com> wrote:
>> Having multiple CFs in a row could be useful for writes. Consider the
>> case when you use one CF to store the data and another to store some
>> kind of secondary index on that data. It will be useful to apply
>> updates to both families atomically.
>
> Except that's not how it works, each Memtable (CF) has its own
> executor thread so even if you put multiple CFs in a Row it's not
> going to be atomic with the current system, and it's a big enough
> change to try to add some kind of coordination there that I don't
> think it's worth it.  (And you have seen that I am not scared of big
> changes, so that should give you pause. :)
>
> Back to YAGNI. :)  Row doesn't fit in the current execution model, so
> rather than leaving it as a half-baked creation, better to excise it
> and if we ever decide to support atomic updates across CFs then that
> would be the time to add it or something like it back.
>
> -Jonathan
>
