On Sat, Dec 18, 2010 at 16:41, Paul Davis <[email protected]> wrote:
> Right, I probably jumped a couple steps there:
>
> The unique datums we have to work with here are the _id/_rev pairs.
> Theoretically (if we ignore rev_stemming) the ordering with which
> these come to us is roughly unimportant.
>
> So the issue with our history (append only) is that there's no real
> way to order it such that we can efficiently seek through it to see
> what we have in common (that I can think of). Ie, replication still
> needs a way to say "I only need to send these bits". Right now its the
> src/dst/seq triple that lets us zip through only new edits.
>
> Well, theoretically, we could keep a merkle tree of all edits we've
> ever seen and go that way, but that'd require keeping a history of
> every edit ever seen which could never be removed.
>
> Granted this is just quick thinking. I could definitely be missing
> something clever.
>
We're on the same page. I don't have anything clever yet either.

The only other thing that's crossed my mind is some way for participants to exchange information about the checkpoints each of them has with a third party. You'd have to somehow verify that the checkpoint being presented to you was actually created by that third party, which involves either trust or verification. I like the verification route because I'd still love to decouple the endpoint from its hostname, which is the idea I was stabbing at quite horribly when I prematurely proposed a couple of patches to give databases UUIDs.

But back to the point: something like "you got a bunch of edits since we last spoke, but I got these edits from this other endpoint, are they the same ones?" Even then, I'm not sure how this works without the Merkle tree.
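
To make the "are they the same ones?" question concrete, here is a rough sketch of comparing two sets of _id/_rev pairs by a Merkle root. This is only an illustration, not anything CouchDB does: the names leaf_hash and merkle_root are made up, the choice of SHA-256 and the 0x00/0x01 prefixes are my assumptions, and it needs every edit to be kept around, which is exactly the cost Paul pointed out.

```python
import hashlib


def leaf_hash(doc_id, rev):
    # Hypothetical leaf encoding: length-prefix the id so that different
    # (_id, _rev) pairs cannot serialize to the same bytes.
    data = f"{len(doc_id)}:{doc_id}:{rev}".encode("utf-8")
    return hashlib.sha256(b"\x00" + data).digest()


def merkle_root(edits):
    """Return a root hash over a collection of (_id, _rev) pairs.

    The pairs are deduplicated and sorted first, so the order in which
    edits arrived does not affect the result.
    """
    level = [leaf_hash(doc_id, rev) for doc_id, rev in sorted(set(edits))]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = b"\x01" + level[i] + level[i + 1]
                next_level.append(hashlib.sha256(pair).digest())
            else:
                # Odd node out: carry it up unchanged.
                next_level.append(level[i])
        level = next_level
    return level[0]


# Two endpoints that saw the same edits in a different order agree;
# one extra edit on either side changes the root.
mine = [("doc1", "1-abc"), ("doc2", "2-def")]
theirs = [("doc2", "2-def"), ("doc1", "1-abc")]
print(merkle_root(mine) == merkle_root(theirs))
print(merkle_root(mine) == merkle_root(theirs + [("doc3", "1-ghi")]))
```

Comparing roots only answers "identical or not". Finding which edits differ would mean walking down the two trees and comparing subtree hashes, and that still assumes both sides built their trees over the same key ordering.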
