Hi there,

I already asked this question on #couchdb, but I'm not fully satisfied
with the answers I got, because some of my follow-up questions were left
unanswered on IRC. So I thought it would be a good idea to put the
question to a wider group. Below I paste both my original question and
the answers I got in #couchdb.

Many thanks for your help,
Jeldrik

==

This was the question (I just added some information):

We are moving a CouchDB to new hardware, but we have a pull replication
(to couch_backup.example.com) that we want to keep. Our planned steps
are:
1. rsync db files from couch_live.example.com to couch_new.example.com
2. compact dbs on couch_new (this is necessary because compression was
turned off on couch_live and we want it turned on now)
# Meanwhile couch_live is still live: clients push data to it and the
couch_backup replication pulls from it
3. start pull replication on couch_new with source couch_live and target
couch_new for all dbs
4. once all dbs are nearly in sync, take a short downtime until the data
is fully in sync, then switch over to couch_new
5. shut down couch_live and the replication to couch_backup
6. new data comes in to couch_new
7. start pull replication on couch_backup with source couch_new
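
To make steps 3 and 7 concrete, here is a minimal sketch of the request
I have in mind for a single database. It assumes the default port 5984,
plain http, no authentication and the db name db1; the helper names are
made up for illustration. The body is POSTed to /_replicate on the node
that pulls, so the target is just the local db name.

```python
import json
import urllib.request

COUCH_PORT = 5984  # assumption: default CouchDB port


def pull_replication_body(source_host, db, continuous=True):
    """Build the JSON body for a pull replication of one database.

    The request goes to the pulling (target) node, so the target is the
    local database name and the source is a full URL.
    """
    return {
        "source": "http://%s:%d/%s" % (source_host, COUCH_PORT, db),
        "target": db,
        "continuous": continuous,
    }


def start_replication(target_host, body):
    """POST the body to /_replicate on the target node (not called here)."""
    req = urllib.request.Request(
        "http://%s:%d/_replicate" % (target_host, COUCH_PORT),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Step 3: couch_new pulls from couch_live.
step3 = pull_replication_body("couch_live.example.com", "db1")
# Step 7: couch_backup pulls from couch_new.
step7 = pull_replication_body("couch_new.example.com", "db1")
```

The only difference between the two bodies is the source URL, which is
exactly the part I am unsure about.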
 
Now the question is how to keep the couch_backup replication. If I
understand it correctly, the replication depends on two values. The
first is the URI of the source. So could a switch from
couch_live.example.com/db1 to couch_new.example.com/db1 break the
replication? The second is, or more precisely are, the sequence numbers
(seq no.). At the moment we turn off couch_live, all three (couch_live,
couch_backup and couch_new) will have the same data, so from the point
of view of the data we have consistency. But the seq no. may differ.
And of course couch_new will immediately receive new data. So how can I
convince couch_backup to start replicating from that one point of data
consistency?
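
To illustrate the worry about the seq no., here is a toy model. It is
not CouchDB code: it assumes plain integer sequence numbers and made-up
values, and it only mimics what a "give me everything after seq N"
request would return. Whether CouchDB would really reuse the old
checkpoint against the new source is precisely what I do not know.

```python
def changes_since(changes, since):
    """Return the (seq, doc_id) entries with seq greater than `since`,
    mimicking a changes request with an integer `since` value."""
    return [(seq, doc_id) for seq, doc_id in changes if seq > since]


# Hypothetical state at switch-over: same documents, different seq no.
couch_live_changes = [(101, "a"), (102, "b"), (103, "c")]
couch_new_changes = [(1, "a"), (2, "b"), (3, "c")]

# Hypothetical checkpoint couch_backup recorded against couch_live.
checkpoint = 103

# A new write arrives on couch_new only.
couch_new_changes.append((4, "d"))

# Feared case: the old checkpoint is applied to the new source.
missed = changes_since(couch_new_changes, checkpoint)

# Alternative: no usable checkpoint, so the feed is read from the start.
rescanned = changes_since(couch_new_changes, 0)
```

In the feared case `missed` is empty, so document "d" would never reach
couch_backup; in the alternative `rescanned` contains all four entries,
including "d".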

==

And these were the responses on IRC, together with my follow-up questions:

15:09 <mar-ia> jeldrik: couch_backup will continue from the last data it
has. You should not need to worry about it. If I have understood
everything correctly :)
15:37 <jeldrik> mar-ia: thx. but how sure are you about that? the
problem is that couch_backup is on a remote site. and it happened to
them when we had a similar system move.
15:44 <mar-ia> jeldrik: Every node knows the last change it has. So when
it starts a replication it asks for all the changes made after that
point. It does not get the complete history, only the latest version (as
always).
15:49 <jeldrik> but if i got it right it does that with the checkpoints
aka seq no., doesn't it? and we had situations where the seq. no of a
replication differed from the source. so couldn't it happen that the new
system has a lower seq no. but new data and because of that after the
change the backup couch asks like for "everything after 'higher seq no'"
and then gets nothing
15:50 <jeldrik> what would break the consistency of the backup
