Yeah ... I have a local copy which adds the port for just that sort of
testing.
If we add the port and do nothing else, all existing replications in
production will re-synchronize from zero. That could take weeks for
large deployments.
Perhaps we could look for the replication doc which includes the port
in the hash content, and if that's a 404, look for the one that does not?
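The fallback lookup above could work something like the following sketch. This is illustrative Python, not the actual CouchDB code (which is Erlang); `repl_id`, `find_checkpoint`, and `local_docs` are hypothetical names, and the hashing details are assumptions:

```python
import hashlib

def repl_id(source, target, include_port):
    """Hypothetical replication-ID hash over the endpoint URLs; when
    include_port is False the port is dropped, matching the current
    (pre-fix) behaviour Robert observed."""
    def key(url):
        _scheme, _, rest = url.partition("://")
        hostport, _, path = rest.partition("/")
        host = hostport.split(":")[0]
        return (hostport if include_port else host) + "/" + path
    return hashlib.md5((key(source) + "|" + key(target)).encode()).hexdigest()

def find_checkpoint(local_docs, source, target):
    """The proposed fallback: look up the checkpoint under the new
    port-inclusive ID first; on a miss (the 404 case), fall back to
    the old port-free ID so existing replications keep their history.
    `local_docs` stands in for the _local checkpoint documents."""
    for include_port in (True, False):
        doc = local_docs.get(repl_id(source, target, include_port))
        if doc is not None:
            return doc
    return None  # no checkpoint under either ID: full replication

# An existing deployment has its checkpoint under the old, port-free ID;
# the fallback finds it instead of forcing a re-sync from zero:
src, tgt = "http://a.example:5984/db", "http://b.example:5984/db"
local_docs = {repl_id(src, tgt, False): {"history": ["..."]}}
assert find_checkpoint(local_docs, src, tgt) is not None
```

New replications would then checkpoint under the port-inclusive ID, and the fallback path would only be hit once per legacy replication.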
Adam
On Jun 22, 2010, at 4:37 PM, Robert Newson <[email protected]>
wrote:
All,
I was testing some clustering code locally and noticed that my
gossip-based state replication was always reporting:
[info] [<0.361.0>] Replication records differ. Scanning histories to
find a common ancestor.
[info] [<0.361.0>] no common ancestry -- performing full replication
I assumed it was my code (since I call couch_rep:replicate directly)
but it wasn't.
It turns out that make_replication_id ignores the port, which was the
only thing that varied when testing a three-node system on my laptop.
This caused all the checkpoint documents to get overwritten, forcing
full replication every time.
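A minimal Python sketch of the failure mode, assuming the replication ID is a hash over the endpoints (the real function is Erlang in couch_rep; the name and hashing details here are illustrative only):

```python
import hashlib
from urllib.parse import urlsplit

def make_replication_id(source, target, include_port):
    """Hypothetical stand-in for couch_rep's make_replication_id:
    hash the two endpoints to key the checkpoint document. With
    include_port=False the port is dropped from the hashed material."""
    def key(url):
        parts = urlsplit(url)
        host = parts.hostname or ""
        port = parts.port if include_port else None
        return f"{host}:{port}{parts.path}"
    return hashlib.md5((key(source) + "|" + key(target)).encode()).hexdigest()

# Replications between local nodes that differ only in port collapse
# to the same ID when the port is ignored, so each one overwrites the
# others' checkpoint document and forces a full replication:
a = make_replication_id("http://localhost:5984/db",
                        "http://localhost:5985/db", include_port=False)
b = make_replication_id("http://localhost:5985/db",
                        "http://localhost:5986/db", include_port=False)
assert a == b  # distinct replications, one checkpoint key
```

On real, separate hosts the hostnames differ, so the IDs differ and the bug never surfaces, which is exactly why local multi-node testing behaves differently.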
I'm sure there's a reason ("%% funky algorithm to preserve backwards
compatibility") but I wonder if this could be re-examined? It worries
me that testing multiple couchdb instances locally behaves differently
(actually, incorrectly) than testing multiple couchdb instances on
real, separate hosts.
B.