On Feb 4, 2010, at 10:44 AM, Paul Davis wrote:

> On Thu, Feb 4, 2010 at 10:19 AM, Adam Kocoloski <[email protected]> wrote:
>> On Feb 3, 2010, at 4:53 AM, Brian Candler wrote:
>> 
>>> On Tue, Feb 02, 2010 at 09:41:28PM +0000, Robert Newson wrote:
>>>> If couchdb tracked replication by a Merkle tree, it would obsolete the
>>>> update_seq mechanism?
>>> 
>>> Only if you weren't doing filtered/selective replication. And probably only
>>> if there was nothing else different between the two databases (e.g. _local
>>> docs, _design docs, reader acls etc)
>> 
>> Correct, Merkle trees are only useful if you expect the two databases to be 
>> completely identical.  But Bob's right, I'm essentially proposing that our 
>> by_seq btree is extended into a full Merkle tree for this particular 
>> use-case.
>> 
>> Adam
> 
> Most intriguing. Could you expand on that a bit?
> 
> Paul

Hi Paul,

The more I think about it, using by_seq may not be the optimal choice here.  
Consider the case where I snapshot my .couch file over to a new server, and in 
the meantime I update the document that was occupying update_seq 1 on the 
original.  The analysis I proposed above would conclude that the replication 
needs to start from the beginning, which is true, but overlooks the fact that 
only one document has changed.
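To make that concrete, here's a toy Python sketch (a simplified stand-in for a real by_seq hash tree, with made-up doc ids and revs, not actual CouchDB internals): updating the doc that held update_seq 1 removes its old seq entry and appends a new one, so every by_seq prefix hash diverges and a seq-based comparison finds no common starting point at all.

```python
import hashlib

def prefix_hashes(seq_entries):
    """Rolling hash over the by_seq index: hashes[i] covers entries 0..i."""
    hashes, acc = [], hashlib.sha1()
    for doc_id, rev in seq_entries:
        acc.update((doc_id + rev).encode())
        hashes.append(acc.hexdigest())
    return hashes

# the snapshot copied over to the new server
snapshot = [("a", "1-x"), ("b", "1-y"), ("c", "1-z")]
# the original after updating doc "a": its old seq entry is gone
# and a new entry is appended at the end of the by_seq index
original = [("b", "1-y"), ("c", "1-z"), ("a", "2-x")]

common = sum(1 for p, q in zip(prefix_hashes(snapshot),
                               prefix_hashes(original)) if p == q)
print(common)  # 0: no common prefix, so the check restarts from scratch
```

One single-document update invalidated all three prefix hashes, even though two of the three documents are still identical on both sides.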

An alternative would be to do the Merkle stuff in the by_id tree, and instead 
of identifying the last update_seq where two DBs are identical, identify the 
set of documents that differ between the two DBs.  Replicate just those 
documents using Filipe's new patch, then record a checkpoint at the source's 
latest update_seq.  You're now fully caught up in case you're planning any 
future _changes-based incremental replications.
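As a rough illustration of the by_id approach, here's a toy Merkle tree in Python over sorted (doc id, rev) pairs; all of the names are hypothetical, nothing here is actual CouchDB internals. Comparing two trees prunes subtrees whose hashes match and returns only the ids whose revisions differ:

```python
import hashlib

def h(data):
    return hashlib.sha1(data.encode()).hexdigest()

def build(items):
    """Build a toy Merkle tree over a sorted list of (doc_id, rev) pairs.
    Leaves hash id+rev; internal nodes hash their children's hashes."""
    if len(items) == 1:
        doc_id, rev = items[0]
        return {"hash": h(doc_id + rev), "id": doc_id}
    mid = len(items) // 2
    left, right = build(items[:mid]), build(items[mid:])
    return {"hash": h(left["hash"] + right["hash"]),
            "left": left, "right": right}

def diff(a, b, out):
    """Collect doc ids whose leaves differ between two trees built
    over the same set of ids."""
    if a["hash"] == b["hash"]:
        return                      # identical subtree: prune, don't descend
    if "id" in a and "id" in b:     # both leaves, and they differ
        out.add(a["id"])
        return
    diff(a["left"], b["left"], out)
    diff(a["right"], b["right"], out)

source = build(sorted([("a", "1-x"), ("b", "2-y"), ("c", "1-z"), ("d", "3-q")]))
target = build(sorted([("a", "1-x"), ("b", "1-y"), ("c", "1-z"), ("d", "3-q")]))
changed = set()
diff(source, target, changed)
print(changed)  # only doc "b" differs, so only it needs to be replicated
```

A real implementation would also have to handle ids present on only one side and trees of different shapes; the point of the sketch is that pruning whole matching subtrees keeps the comparison cheap even when the databases are large and nearly identical.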

If we went ahead and implemented this I think the UUID becomes superfluous from 
the replicator's perspective.  You wouldn't want to restrict this Merkle tree 
check to UUID-matched DBs, as it would be useful for reducing entropy in a 
sharded database cluster that stores multiple copies of each document in 
different database shards.  In fact, IIRC that was a Dynamo feature in the 
original Amazon paper.

Adam



