Re: proposed replication rev history changes

Damien Katz Sun, 08 Feb 2009 21:41:17 -0800


On Feb 9, 2009, at 12:31 AM, Adam Kocoloski wrote:

Ok, thanks for the clarification. I don't see any major downsides beyond the ones you already mentioned. The inability to replicate between versions is a bit of a bummer -- I'd want to at least look into a bridge that lets old servers replicate to new ones.
Your point about reducing the chance of collision is a good one, especially since Couch is using a 32 bit sample space for revision IDs. The probability of zero collisions between any two revisions in a given document history is
N!/((N-M)! * N^M)
with N = 2**32 and M = "max rev history". With M = 128, that probability drops to 0.999998. In a 400k document DB where each doc has the max number of revisions it's likely that at least one has a duplicate rev. That's no good. I think we could eventually see transient cases of revisions being skipped by the replicator with the trunk code.
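The figures above can be checked numerically. This is only a rough sketch, assuming revision IDs are drawn uniformly and independently from the N-value space; the function name `prob_no_collision` and the variable names are made up for illustration and are not part of Couch.

```python
import math

def prob_no_collision(n: int, m: int) -> float:
    """Probability that m IDs drawn uniformly from n values are all distinct.

    Equals n! / ((n - m)! * n**m), evaluated in log space so the huge
    factorials never have to be formed.
    """
    # The product is (n/n) * ((n-1)/n) * ... * ((n-m+1)/n).
    log_p = sum(math.log1p(-i / n) for i in range(m))
    return math.exp(log_p)

N = 2**32        # 32 bit revision ID space
M = 128          # "max rev history"
DOCS = 400_000   # documents, each assumed to hold M revisions

p_doc = prob_no_collision(N, M)   # one document has no duplicate rev
p_any = 1.0 - p_doc ** DOCS       # at least one document has a duplicate

print(f"P(no duplicate in one doc)   = {p_doc:.6f}")
print(f"P(some doc has a duplicate)  = {p_any:.3f}")
```

Under these assumptions the per-document value rounds to 0.999998, and the chance that at least one of the 400k documents has a duplicate rev comes out at about 0.53.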
Adding the revseq doesn't reduce the chances of a duplicate rev, but it does mean that replication won't accidentally match revisions from different revseqs. Instead, the concern would be that two different servers would generate the same revision ID from different updates at the same revseq. It's a concern only for multi-master setups, and even then each document that had been updated on both source and target would only have a 1/N chance of being skipped due to an accidentally matching revision. I guess it would happen once every 3 billion times or so.
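To illustrate the point about revseqs, here is a minimal sketch that assumes a revision is compared as a (revseq, ID) pair. The `Rev` type, its field names, and `is_known` are hypothetical and only meant to show the matching rule, not how the replicator is actually written.

```python
from typing import NamedTuple

class Rev(NamedTuple):
    revseq: int   # hypothetical: position of the edit in the document's history
    rev_id: int   # random revision ID from the 32 bit space

def is_known(rev: Rev, target_history: list[Rev]) -> bool:
    """True if the target already holds this exact (revseq, ID) pair."""
    return rev in target_history

target = [Rev(1, 0x1234ABCD), Rev(2, 0x0BADF00D), Rev(3, 0xDEADBEEF)]

# Same ID but a different revseq: no accidental match.
print(is_known(Rev(7, 0xDEADBEEF), target))
# Same ID at the same revseq: treated as the same revision, which is the
# remaining 1/N risk when two servers update the same document independently.
print(is_known(Rev(3, 0xDEADBEEF), target))
```

Comparing bare IDs instead would treat both cases as a match, which is the situation the revseq is meant to rule out.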
Or Couch could switch to a 64 bit space for the revision IDs ;-)

There is nothing preventing larger revs (or even non-integer revs) as it's just stored as a string (real efficient I know). The size could easily be a server or database setting.


-Damien

Adam

On Feb 8, 2009, at 2:40 PM, Damien Katz wrote:
I don't think it's strictly necessary, but it makes merging new edits simpler and it significantly reduces the chances of collisions between revision ids, so there is less ambiguity. What downsides do you see?
-Damien

On Feb 8, 2009, at 2:28 PM, Adam Kocoloski wrote:
Hi Damien, it seems to me that you're conflating two separate issues. I agree that the revision history should be trimmed, and that this will potentially introduce spurious conflicts when two servers have no shared history for a document. I don't see how this change by itself requires the addition of a revseq to the JSON revision format. Is it really required?
Adam
