Hi,

CouchDB 2.0 introduced clustering; the ability to scale a single database 
across multiple nodes, increasing both the maximum size of a database and 
adding native fault-tolerance. This welcome and considerable step forward was 
not without its trade-offs. In the years since 2.0 was released, users 
frequently encounter the following issues as a direct consequence of the 2.0 
clustering approach:

1. Conflict revisions can be created on normal concurrent updates issued to a 
single database, since each replica of a database shard independently chooses 
whether to accept a given update, and all replicas will eventually propagate 
updates that any one of them has chosen to accept.
2. Secondary indexes ("views") do not scale the same way as document lookups, 
as they are sharded by doc id, not emitted view key (thus forcing a 
consultation of all shard ranges for each query).
3. The changes feed is no longer totally ordered and, worse, could replay 
earlier changes in the event of a node failure (even a temporary one).

The idea is to use FoundationDB as the new CouchDB foundational layer, letting 
it take care of data storage and placement. An introduction to FoundationDB 
would take up too much space here so I will summarise it as a highly scalable 
ordered key-value store with transactional semantics, provides strong 
consistency, scaling from a single node to many. It is licensed under the ASLv2 
but is not an Apache project.

By using FoundationDB we can solve all three of the problems listed above and 
deliver semantics much closer to CouchDB 1.x's behaviour while improving upon 
the scalability advantages that 2.0 introduced. The essential character of 
CouchDB would be preserved (MVCC for documents, replication between CouchDB 
databases) but the underlying plumbing would change significantly. In addition, 
this new foundation will allow us to add long wished-for features more easily. 
For example, multi-document transactions become possible, as does efficient 
field-level reading and writing. A further thought is the ability to update 
views transactionally with the database update.

For those familiar with the CouchDB 2.0 architecture, the proposal is, in 
effect, to change all the functions in fabric.erl so that they work against a 
(possibly remote) FoundationDB cluster instead of the current implementation of 
calling into the original CouchDB 1.x code (couch_btree, couch_file, etc).

This is a large change and, for full disclosure, the IBM Cloudant team are 
proposing it. We have done our due diligence in investigating FoundationDB as 
well as detailed investigation into how CouchDB semantics would be built on top 
of FoundationDB. Any and all decisions on that must take place here on the 
CouchDB developer mailing list, of course, but we are confident that this is 
feasible.
During those investigations we have identified a small number of CouchDB 
features that we do not yet see a way to do on FoundationDB, the main one being 
custom (Javascript) reduces. This is a direct consequence of no longer rolling 
our own persistence layer (couch_btree and friends) and would likely apply to 
any alternative technology. 

I think this would be a great advance for CouchDB, preserving what makes 
CouchDB special but taking advantage of the superbly engineered FoundationDB 
software at the bottom of the stack.

Regards,
Robert Newson

Reply via email to