On Mar 26, 2014, at 8:25 PM, Stanley Iriele wrote:

> You said couchbase doesn't have MVCC ? All docs say That it
> uses couchDB MVCC append only under the good on a single node…

I’ve read that second line four times and I can’t figure out what the heck it 
means. “Under the good”?
Oh wait, you mean “under the hood”? Yes, the nodes use CouchDB databases for 
persistence. But the MVCC semantics aren’t exposed. See below.

> Could you elaborate a tad on what you mean by doesn't have MVCC?

MVCC stands for “Multi-Version Concurrency Control”. Wikipedia:

“…each user connected to the database sees a snapshot of the database at a 
particular instant in time. Any changes made by a writer will not be seen by 
other users of the database until the changes have been completed (or, in 
database terms: until the transaction has been committed.)
When an MVCC database needs to update an item of data, it will not overwrite 
the old data with new data, but instead mark the old data as obsolete and add 
the newer version elsewhere.”

Couchbase Server doesn’t store multiple versions of a document*. Nor does it 
use snapshots of the database state.

The key-value storage part of Couchbase Server is *NOTHING* like CouchDB. At 
all. Stop thinking of them as being related. It inherits from memcached, which 
is a distributed in-memory cache engine.

The CouchDB heritage of Couchbase Server is used for two things: (a) 
_asynchronously_ writing the documents to a CouchDB b-tree for persistence, and 
(b) indexing that b-tree with map-reduce views that can be queried 
almost exactly like CouchDB.

Each node has an in-memory key->value map that’s read and updated by clients. 
Note: writes are made directly in RAM. (This is part of the “insane speed” 
thing.) A parallel task collects all the updated values and writes them to a 
CouchDB-compatible b-tree file. Another parallel task sends the changes to the 
neighboring nodes so they can keep backups in case this node goes down.

So when client A does a Put and then client B does a Get, client B gets the 
value from RAM that client A wrote there a few microseconds earlier. The 
database file isn’t involved at all. No MVCC.
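
To make that flow concrete, here is a toy sketch in Python. It is only an illustration of the shape described above, not Couchbase code: the class name ToyNode, its methods, and the list standing in for the persistence file are all made up for this example, and replication to neighboring nodes is left out entirely.

```python
import queue
import threading


class ToyNode:
    """Hypothetical model of one node: a RAM map plus an asynchronous persister."""

    def __init__(self):
        self.ram = {}                # in-memory key -> value map that clients use
        self.dirty = queue.Queue()   # updates waiting to be written out
        self.disk_log = []           # stand-in for the append-only persistence file
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._persist_loop, daemon=True)
        worker.start()

    def put(self, key, value):
        with self._lock:
            self.ram[key] = value    # the write lands in RAM and overwrites in place
        self.dirty.put((key, value)) # persistence is queued, not awaited

    def get(self, key):
        with self._lock:
            return self.ram.get(key) # reads are served from RAM only

    def _persist_loop(self):
        # Background task: drain queued updates and append them to the "file".
        while True:
            record = self.dirty.get()
            self.disk_log.append(record)
            self.dirty.task_done()

    def flush(self):
        # Assumption for the demo: block until everything queued has been appended.
        self.dirty.join()


node = ToyNode()
node.put("doc1", {"n": 1})     # client A writes
seen = node.get("doc1")        # client B reads A's value straight from RAM
node.put("doc1", {"n": 2})     # a later write replaces the value in RAM
node.flush()
```

In this sketch get() only ever returns the single current value; the older record survives in disk_log, but nothing reads it back, which is roughly the situation the footnote below describes.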

This is getting long, but I want to add that in my experience if you try to use 
Couchbase Server as though it were CouchDB, you’ll get very frustrated —  
viewed that way it feels primitive and unreliable. You have to treat it as its 
own thing and accept that you’re making trade-offs for performance/scalability, 
and must use different hammers to solve your problems. (And of course, if you 
want CouchDB-style semantics, you can add the Couchbase Sync Gateway. But you 
won’t get the same performance as raw Couchbase.)

—Jens

* OK, technically the database file used for persistence contains multiple 
versions of the document since it’s append-only. But the old versions are never 
accessed, and there’s no way to get to them through the API, so they may as 
well not exist.
