Thanks, that makes it clearer. I also looked at the mvcc code, as you pointed out.
So I am wondering where ZK is used specifically.

On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl <[email protected]> wrote:
> Nope, not using ZK, that would not scale down to the cell level.
> You'll probably have to stare at the code in
> MultiVersionConsistencyControl for a while (I know I had to).
>
> The basic flow of a write operation is this:
> 1. lock the row
> 2. persist change to the write ahead log
> 3. get a "writenumber" from mvcc (this is basically a timestamp)
> 4. apply change to the memstore (using that write number).
> 5. advance the readpoint (maximum timestamp of changes that reads will see)
>    -- this is the point where readers see the change
> 6. unlock the row
>
> (7. when memstore is full, flush it to a new disk file, but this is done
> asynchronously, and not really important, although it has some complicated
> implications when the flush happens while there are readers reading from an
> old read point)
>
> The above is relaxed sometimes for idempotent operations.
>
> -- Lars
>
> ----- Original Message -----
> From: Mohit Anchlia <[email protected]>
> To: [email protected]; lars hofhansl <[email protected]>
> Cc:
> Sent: Thursday, December 1, 2011 3:03 PM
> Subject: Re: Atomicity questions
>
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?
>
> On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl <[email protected]> wrote:
>> Hi Mohit,
>>
>> the best way to study this is to look at MultiVersionConsistencyControl.java
>> (since you are asking how this is handled internally).
>>
>> In a nutshell this ensures that read operations don't see writes that are
>> not completed, by (1) defining a thread read point that is rolled forward
>> only after a completed operation and (2) assigning a special timestamp (not
>> the timestamp that you set from the client API) to all KeyValues.
>>
>> -- Lars
>>
>> ----- Original Message -----
>> From: Mohit Anchlia <[email protected]>
>> To: [email protected]
>> Cc:
>> Sent: Thursday, December 1, 2011 2:22 PM
>> Subject: Atomicity questions
>>
>> I have some questions about ACID after reading this page,
>> http://hbase.apache.org/acid-semantics.html
>>
>> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
>> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>>
>> How is this internally handled in hbase such that above is possible?
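
For anyone following the thread, steps 1 to 6 of the write flow Lars describes can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not HBase's code: the class ToyMvccRegion and all of its methods are hypothetical names, the "write ahead log" is just a list, and the read point logic is a simplified guess at what MultiVersionConsistencyControl does (a pending-write set instead of its real queue). Step 7 (flushing) is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the write flow from the thread. Every name here is hypothetical;
// this is NOT HBase's MultiVersionConsistencyControl.
class ToyMvccRegion {
    // One stored cell version, tagged with the internal write number.
    private static final class Version {
        final long writeNumber;
        final String value;
        Version(long writeNumber, String value) {
            this.writeNumber = writeNumber;
            this.value = value;
        }
    }

    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final List<String> writeAheadLog = new ArrayList<>(); // stand-in for the WAL
    // memstore: row -> column -> versions, oldest first
    private final Map<String, Map<String, List<Version>>> memstore = new HashMap<>();
    private final TreeSet<Long> pendingWrites = new TreeSet<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0;

    // Full write path, following the numbered steps in the mail.
    void put(String row, Map<String, String> columns) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();                                    // 1. lock the row
        try {
            appendToLog(row, columns);                  // 2. persist change to the log
            long writeNumber = beginWrite();            // 3. get a write number
            applyToMemstore(row, columns, writeNumber); // 4. apply change to the memstore
            completeWrite(writeNumber);                 // 5. advance the read point
        } finally {
            lock.unlock();                              // 6. unlock the row
        }
    }

    synchronized void appendToLog(String row, Map<String, String> columns) {
        writeAheadLog.add(row + "=" + columns);
    }

    synchronized long beginWrite() {
        long writeNumber = nextWriteNumber++;
        pendingWrites.add(writeNumber);
        return writeNumber;
    }

    synchronized void applyToMemstore(String row, Map<String, String> columns, long writeNumber) {
        Map<String, List<Version>> cells = memstore.computeIfAbsent(row, r -> new HashMap<>());
        for (Map.Entry<String, String> e : columns.entrySet()) {
            cells.computeIfAbsent(e.getKey(), c -> new ArrayList<>())
                 .add(new Version(writeNumber, e.getValue()));
        }
    }

    // The read point only moves past write numbers that have all completed.
    synchronized void completeWrite(long writeNumber) {
        pendingWrites.remove(writeNumber);
        readPoint = pendingWrites.isEmpty() ? nextWriteNumber - 1 : pendingWrites.first() - 1;
    }

    // Readers see, per column, the newest version at or below the read point.
    synchronized Map<String, String> get(String row) {
        Map<String, String> result = new TreeMap<>();
        Map<String, List<Version>> cells = memstore.get(row);
        if (cells == null) {
            return result;
        }
        for (Map.Entry<String, List<Version>> e : cells.entrySet()) {
            List<Version> versions = e.getValue();
            for (int i = versions.size() - 1; i >= 0; i--) {
                if (versions.get(i).writeNumber <= readPoint) {
                    result.put(e.getKey(), versions.get(i).value);
                    break;
                }
            }
        }
        return result;
    }
}
```

In this sketch, a get() that runs after step 4 but before step 5 of a second write still returns a=1,b=1,c=1, because the new cells carry a write number above the read point. Once completeWrite() runs, all three columns flip to the new values together, which is the "never a=1,b=2,c=1" behaviour asked about at the start of the thread.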
