fail fast with checkpoint conflicts
-----------------------------------
Key: COUCHDB-1080
URL: https://issues.apache.org/jira/browse/COUCHDB-1080
Project: CouchDB
Issue Type: Improvement
Components: Replication
Affects Versions: 1.0.2
Reporter: Randall Leeds
Fix For: 1.1, 1.2
I've thought about this long and hard and probably should have submitted the
bug a long time ago. I've also run this in production for months.
When a checkpoint conflict occurs, aborting is almost always the right thing to
do.
If there is a rev mismatch, it could mean there are two conflicting
(continuous and one-shot) replications running between the same hosts. Without
reloading the history documents, checkpoints will continue to fail forever. This
could leave us in a state with many replicated changes but no checkpoints.
Similarly, a successful checkpoint but a lost/timed-out response could cause
this situation.
Since the supervisor will restart the replication anyway, I think it's safer to
abort and retry.
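
To illustrate the abort-and-retry idea, here is a minimal Python sketch. It is
not CouchDB's Erlang replicator; CheckpointStore, replicate, supervise and the
interfere hook are hypothetical names invented for this example. It assumes a
single checkpoint document guarded by a rev, and a supervisor that restarts
the replication whenever it aborts.

```python
class CheckpointConflict(Exception):
    """Raised when the stored checkpoint rev differs from the one we hold."""


class CheckpointStore:
    """Toy stand-in for a database holding one replication history document."""

    def __init__(self):
        self.rev = 0
        self.seq = 0

    def read(self):
        return self.rev, self.seq

    def write(self, expected_rev, seq):
        # Optimistic concurrency: reject the write if the caller's rev is stale.
        if expected_rev != self.rev:
            raise CheckpointConflict(
                f"expected rev {expected_rev}, found {self.rev}"
            )
        self.rev += 1
        self.seq = seq
        return self.rev


def replicate(store, changes, interfere=None):
    """Checkpoint each change; abort on the first checkpoint conflict."""
    rev, start_seq = store.read()  # history is reloaded on every (re)start
    for seq in changes:
        if seq <= start_seq:
            continue  # already covered by the stored checkpoint
        if interfere is not None:
            interfere(seq)  # hypothetical hook simulating a competing writer
        rev = store.write(rev, seq)  # a conflict propagates: fail fast
    return store.seq


def supervise(store, changes, interfere=None, max_restarts=3):
    """Restart the replication after each abort, up to max_restarts attempts."""
    for attempt in range(1, max_restarts + 1):
        try:
            return replicate(store, changes, interfere), attempt
        except CheckpointConflict:
            continue  # the next attempt re-reads the checkpoint document
    raise RuntimeError("too many checkpoint conflicts")
```

Because each restart re-reads the checkpoint document, the retry picks up the
current rev instead of failing on the stale one indefinitely.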