Seems to be what Merkle trees are for, which would allow for the kinds of fast-forwarding this thread appears to be discussing. I think that's essentially (or exactly) what git does, fwiw.
If couchdb tracked replication by a Merkle tree, it would obsolete the update_seq mechanism?

B.

On Tue, Feb 2, 2010 at 8:17 PM, Adam Kocoloski <[email protected]> wrote:
> On Feb 2, 2010, at 2:48 PM, Randall Leeds wrote:
>
>> On Tue, Feb 2, 2010 at 11:39, Chris Anderson <[email protected]> wrote:
>>> On Tue, Feb 2, 2010 at 11:25 AM, Randall Leeds <[email protected]>
>>> wrote:
>>>> I'm not entirely happy with this patch and I'd like some help figuring
>>>> out what to do about it.
>>>>
>>>> I foresee problems when database files are copied or backed up on
>>>> disk. It's possible to end up with two couchdb instances hosting
>>>> databases with the same uuid. The problem is that the uuid is no
>>>> longer meaningful, as it doesn't do what it was intended to (uniquely
>>>> identify the database).
>>>>
>>>> Can anyone see a way around this?
>>>>
>>>
>>> I think we don't mind this. As I mentioned above, when we see that 2
>>> db files have the same uuid we can do a fast-forward replication by
>>> starting from the lower of the 2 dbs sequence #s for replication.
>>> (maybe... Adam, does this sound sane?)
>>
>> If changes had been made to both dbs separately then the lower
>> sequence # might be beyond the sequence number at which the histories
>> diverged and the changes to the "younger" db would be lost.
>
> Yes, that's the problem we'll need to solve if we're going to use UUIDs to
> fast-forward replication. Off the top of my head, one way to do that would
> be store a DB revid calculated in the same way as the document revids (and
> seed it with the UUID at the beginning). Then if you find an update_seq
> where the revision IDs match, you can start the replication from that point.
>
> There may be cheaper ways, though.
>
> Adam
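
For what it's worth, here is a rough sketch of the "DB revid" idea from the quoted message: a hash chain seeded with the database UUID and extended once per update, so that two copies agree at a given update_seq only if they share the whole history up to that point. This is not CouchDB code. The function names, the use of SHA-256, and the (doc_id, doc_rev) update tuples are all assumptions made for illustration.

```python
import hashlib


def next_db_rev(prev_rev, doc_id, doc_rev):
    """Hypothetical chained DB revision: hash of the previous DB rev plus this update."""
    h = hashlib.sha256()
    for part in (prev_rev, doc_id, doc_rev):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so field boundaries are unambiguous
    return h.hexdigest()


def build_history(db_uuid, updates):
    """history[seq] is the DB rev after `seq` updates; history[0] is the UUID seed."""
    history = [db_uuid]
    for doc_id, doc_rev in updates:
        history.append(next_db_rev(history[-1], doc_id, doc_rev))
    return history


def common_seq(history_a, history_b):
    """Highest seq at which both copies have the same DB rev, or -1 if none.

    Each entry depends on every earlier entry, so once the chains diverge
    they stay diverged (barring hash collisions) and we can stop at the
    first mismatch.
    """
    last_match = -1
    for seq, (rev_a, rev_b) in enumerate(zip(history_a, history_b)):
        if rev_a != rev_b:
            break
        last_match = seq
    return last_match


# Two copies of the same file that were then written to independently.
shared = [("doc-a", "1-aaa"), ("doc-b", "1-bbb")]
copy_1 = build_history("some-uuid", shared + [("doc-c", "1-ccc")])
copy_2 = build_history("some-uuid", shared + [("doc-d", "1-ddd"), ("doc-e", "1-eee")])

# Replication could fast-forward to this seq instead of the lower update_seq.
print(common_seq(copy_1, copy_2))
```

In this sketch the lower of the two update_seqs is 3, but the histories diverged after seq 2, which is the case Randall describes. Since matching is monotone, a binary search over seq would also work and would avoid comparing every entry.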
