Ok, thanks for the clarification. I don't see any major downsides
beyond the ones you already mentioned. The inability to replicate
between versions is a bit of a bummer -- I'd want to at least look
into a bridge that lets old servers replicate to new ones.
Your point about reducing the chance of collision is a good one,
especially since Couch is using a 32-bit sample space for revision
IDs. The probability of zero collisions between any two revisions in
a given document history is
N!/((N-M)! * N^M)
with N = 2**32 and M = "max rev history". With M = 128, that
probability drops to 0.999998. In a 400k-document DB where each doc
has the max number of revisions, it's more likely than not (roughly
53%) that at least one has a duplicate rev. That's no good. I think
we could eventually see
transient cases of revisions being skipped by the replicator with the
trunk code.
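
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula above. It assumes revision IDs are drawn independently and uniformly from the 32-bit space, which is a simplification; the function and variable names are made up for illustration and are not from the Couch source.

```python
import math


def prob_no_collision(n: int, m: int) -> float:
    """Probability that m IDs drawn uniformly from n values are all distinct.

    Equivalent to n! / ((n - m)! * n**m), computed as the product of
    (1 - k/n) for k = 0..m-1 in log space to avoid huge factorials.
    """
    log_p = sum(math.log1p(-k / n) for k in range(m))
    return math.exp(log_p)


N = 2**32        # 32-bit revision ID space
M = 128          # assumed "max rev history"
DOCS = 400_000   # documents, each assumed to hold M revisions

p_doc = prob_no_collision(N, M)   # one document's history is collision-free
p_any = 1.0 - p_doc ** DOCS       # at least one document has a duplicate rev

print(f"per-document: {p_doc:.6f}")
print(f"at least one duplicate across {DOCS} docs: {p_any:.2f}")
```

The per-document figure comes out at about 0.999998, and the chance of at least one duplicate across the whole DB is a bit over one half.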
Adding the revseq doesn't reduce the chances of a duplicate rev, but
it does mean that replication won't accidentally match revisions from
different revseqs. Instead, the concern would be that two different
servers would generate the same revision ID from different updates at
the same revseq. It's a concern only for multi-master setups, and
even then each document that had been updated on both source and
target would only have a 1/N chance of being skipped due to an
accidentally matching revision. I guess it would happen once every 4
billion times or so.
Or Couch could switch to a 64-bit space for the revision IDs ;-)
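
To get a rough sense of what a wider ID space would buy, here is a small Python comparison of 32-bit and 64-bit IDs. It is only a sketch under the same assumption of independent, uniformly random revision IDs, with M = 128 and 400k documents taken from the numbers earlier in this message; the helper names are hypothetical.

```python
import math


def p_history_collision(bits: int, max_revs: int) -> float:
    """Chance that some pair among max_revs random IDs of the given width collides."""
    n = 2**bits
    log_p_none = sum(math.log1p(-k / n) for k in range(max_revs))
    # -expm1(x) == 1 - exp(x), but stays accurate when the result is tiny
    return -math.expm1(log_p_none)


def p_any_doc(p_doc: float, docs: int) -> float:
    """Chance that at least one of docs documents has a collision in its history."""
    return -math.expm1(docs * math.log1p(-p_doc))


for bits in (32, 64):
    p_doc = p_history_collision(bits, 128)
    p_db = p_any_doc(p_doc, 400_000)
    p_skip = 1 / 2**bits  # per-update chance of an accidental match at one revseq
    print(f"{bits}-bit: per-doc {p_doc:.3e}, whole DB {p_db:.3e}, per-update {p_skip:.3e}")
```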
Adam
On Feb 8, 2009, at 2:40 PM, Damien Katz wrote:
I don't think it's strictly necessary, but it makes merging new
edits simpler and it significantly reduces the chances of collisions
between revision ids, so there is less ambiguity. What downsides do
you see?
-Damien
On Feb 8, 2009, at 2:28 PM, Adam Kocoloski wrote:
Hi Damien, it seems to me that you're conflating two separate
issues. I agree that the revision history should be trimmed, and
that this will potentially introduce spurious conflicts when two
servers have no shared history for a document. I don't see how
this change by itself requires the addition of a revseq to the JSON
revision format. Is it really required?
Adam