Hi there, I already asked this question on #couchdb, but I'm not really satisfied with the answers I got, because some open questions were left unanswered on IRC. I thought it would be a good idea to put the question to a wider group. Below I paste both my original question and the answers I got in #couchdb.
Many thanks for your help,
Jeldrik

== This was the question (I just added some information):

We are moving a CouchDB to new hardware, but we have a pull replication (couch_backup.example.com) which we want to keep. Our planned steps are:

1. rsync the db files from couch_live.example.com to couch_new.example.com
2. compact the dbs on couch_new (this is necessary because compression was turned off on couch_live and we want it turned on now)
   # Meanwhile couch_live is still live; data is pushed to it from clients and pulled by the couch_backup replication
3. start a pull replication on couch_new with source couch_live and target couch_new for all dbs
4. once all dbs are nearly in sync, have a short downtime until the data is fully in sync, then switch over to couch_new
5. shut down couch_live and the replication to couch_backup
6. new data is coming in to couch_new
7. start a pull replication on couch_backup with source couch_new

Now the question is how to keep the couch_backup replication. If I got it right, the replication depends on two values. The first one is the URI of the source. So could a switch from couch_live.example.com/db1 to couch_new.example.com/db1 break the replication? The second one is, or more precisely are, the seq numbers. At the moment when we turn off couch_live, all three (couch_live, couch_backup and couch_new) will have the same data. So from the point of view of the data we have consistency. But maybe the seq numbers differ. Of course couch_new will immediately receive new data. So how can I convince couch_backup to start replication from that one point of data consistency?

== And these were the responses, and my follow-up questions, on IRC:

15:09 <mar-ia> jeldrik: couch_backup will continue from the last data it has. You should not need to worry about it. If I have understood everything correctly :)
15:37 <jeldrik> mar-ia: thx. but how sure are you about that? the problem is that couch_backup is on a remote site.
and it happened to them when we had a similar system move.
15:44 <mar-ia> jeldrik: Every node knows the last change it has. So when it starts a replication it asks for all the changes made after that point. It does not get the complete history, only the latest version (as always).
15:49 <jeldrik> but if i got it right it does that with the checkpoints aka seq no., doesn't it? and we had situations where the seq no. of a replication differed from the source. so couldn't it happen that the new system has a lower seq no. but new data, and because of that, after the change, the backup couch asks for "everything after 'higher seq no'" and then gets nothing
15:50 <jeldrik> what would break the consistency of the backup
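
For reference, the pull replications in steps 3 and 7 of the plan could be started roughly like this. It is only a minimal sketch: it assumes CouchDB's `_replicate` endpoint, the default port 5984, plain HTTP and no authentication, and the helper names (`db_url`, `pull_replication_body`, `start_pull_replication`) are hypothetical, not part of any CouchDB client library.

```python
import json
import urllib.request

COUCH_PORT = 5984  # assumption: default CouchDB port, no authentication


def db_url(host, db, port=COUCH_PORT):
    """Return the HTTP URL of database `db` on `host`."""
    return "http://%s:%d/%s" % (host, port, db)


def pull_replication_body(source_host, target_host, db, continuous=True):
    """Build the JSON body for a replication of `db` from source to target.

    For a pull replication the body is POSTed to the *target* node,
    which then fetches the changes from the source.
    """
    return {
        "source": db_url(source_host, db),
        "target": db_url(target_host, db),
        "continuous": continuous,
    }


def start_pull_replication(source_host, target_host, db):
    """POST the request to the target's /_replicate (needs a live server)."""
    body = pull_replication_body(source_host, target_host, db)
    req = urllib.request.Request(
        "http://%s:%d/_replicate" % (target_host, COUCH_PORT),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Step 3 would then be `start_pull_replication("couch_live.example.com", "couch_new.example.com", "db1")` for each db, and step 7 the same call with couch_new as source and couch_backup as target. Only the source URL changes between the old and the new backup replication, which is exactly the first of the two values I am asking about.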
