When it starts, the replicator looks for a checkpoint document on both the source and the target. This document records the update_seq that the previous replication run reached. The checkpoint document's id is derived from the source hostname:port and the target hostname:port (plus some other properties).
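
To make that concrete, here is a minimal sketch (Python 3, standard library only) of pulling up the checkpoint a replication left behind. The hostnames, database name and _replicator document id below are placeholders borrowed from this thread, and the exact fields in the checkpoint document can differ between CouchDB versions, so treat it as an illustration rather than a recipe.

    # Sketch: inspect the checkpoint a replication left on source and target.
    # All names below (CENTRAL, CLIENT, DB, REPL_DOC) are placeholders.
    import json
    import urllib.request

    CENTRAL = "https://central.couch:6984"
    CLIENT = "http://localhost:5984"
    DB = "database"
    REPL_DOC = "central-to-client"   # hypothetical _replicator doc id

    def get_json(url):
        """GET a URL and decode the JSON body."""
        with urllib.request.urlopen(url) as resp:
            return json.loads(resp.read().decode("utf-8"))

    # Once the job is running, the replication manager writes a
    # _replication_id field into the _replicator document.
    repl_doc = get_json("%s/_replicator/%s" % (CLIENT, REPL_DOC))
    repl_id = repl_doc["_replication_id"]

    # The checkpoint itself lives in a _local document named after that id,
    # on both the source and the target.
    for base in (CENTRAL, CLIENT):
        checkpoint = get_json("%s/%s/_local/%s" % (base, DB, repl_id))
        # source_last_seq is the update_seq the previous run reached.
        print(base, checkpoint.get("source_last_seq"))

Comparing the two copies (they should agree on the last recorded session) is a quick way to see whether the replicator will resume from a checkpoint or start over.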
B.

On 11 Aug 2012, at 14:38, Ladislav Thon wrote:

> Friendly ping? :-)
>
> LT
>
> 2012/6/27 Ladislav Thon <[email protected]>
>
>> Hi,
>>
>> we're using CouchDB (version 1.1.1 currently, but planning to upgrade to
>> 1.2.0) because of its multi-master replication. The replication topology is
>> a simple star -- a single central server and a number of clients that
>> replicate both from and to the central server. Writes are (almost) always
>> done on the clients.
>>
>> Now, for high availability, the central server isn't actually a single
>> machine, but two machines (and therefore two couches) whose IP addresses
>> are mapped to the same domain name (DNS round robin). These two couches
>> also replicate with each other. The clients don't know about this; they
>> always replicate from and to https://central.couch:6984/database.
>>
>> This might not be the best architecture for HA and we would be able to
>> change it, but I'd still love to get an answer to this question: is CouchDB
>> able to cope with this? How does it know that it replicates with the same
>> couch it replicated with before (so that it only has to replay changes),
>> and how does it recognize that it replicates with a different couch than
>> before (and has to copy the whole database)?
>>
>> I know that it has already been proposed several times to add a UUID to the
>> CouchDB server/database, which would solve this issue, and I also know that
>> it's very easy to end up with duplicates, which renders universally
>> unique identifiers ... not so *unique* (i.e. useless).
>>
>> ---
>>
>> Also, I have a question about replication monitoring. Are there any best
>> practices for monitoring whether the replication is working? I can of
>> course read the corresponding document in the _replicator database and look
>> at the _replication_state field, but this will only tell me that the
>> replication is *running* -- and I want to know that it's actually *working*.
>> For now, we are using a pretty naive approach:
>> 1. Every 10 minutes, write a document with the current date and time to the
>> central couch.
>> 2. Periodically check on all clients (we have them under control) that the
>> document isn't too old.
>> Is there a better approach?
>>
>> Thanks for your opinions!
>>
>> LT
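
For what it's worth, the "10-minute document" check described above can stay quite small. A rough sketch (Python 3, standard library only); the URLs, the heartbeat document id and the staleness threshold are invented for illustration:

    # Sketch of the heartbeat check described in the quoted mail.
    # HEARTBEAT_ID and MAX_AGE_SECONDS are made-up values.
    import json
    import time
    import urllib.error
    import urllib.request

    CENTRAL = "https://central.couch:6984/database"
    LOCAL = "http://localhost:5984/database"
    HEARTBEAT_ID = "replication-heartbeat"   # hypothetical document id
    MAX_AGE_SECONDS = 3 * 600                # allow a couple of missed cycles

    def write_heartbeat():
        """Run against the central couch every 10 minutes (step 1)."""
        url = "%s/%s" % (CENTRAL, HEARTBEAT_ID)
        doc = {"updated_at": int(time.time())}
        try:
            current = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))
            doc["_rev"] = current["_rev"]    # update the same doc in place
        except urllib.error.HTTPError:
            pass                             # first run: document doesn't exist yet
        req = urllib.request.Request(
            url, data=json.dumps(doc).encode("utf-8"), method="PUT",
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)

    def check_heartbeat():
        """Run on each client (step 2); complain if the copy is stale."""
        url = "%s/%s" % (LOCAL, HEARTBEAT_ID)
        doc = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))
        age = int(time.time()) - doc["updated_at"]
        if age > MAX_AGE_SECONDS:
            raise RuntimeError("replication looks stale: heartbeat is %ds old" % age)

write_heartbeat() would run from cron against the central couch and check_heartbeat() on every client, matching steps 1 and 2 in the quoted mail; it proves documents are actually flowing rather than just that the _replicator job claims to be running.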
