I think the problem is not so much deleting and recreating a database as wiping a virtual machine and restoring it from a backup. Now you have more or less gone back in time with the target database, and it has different contents but the same uuid.
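
To make that concrete, here is a rough Python sketch of the restore case. Everything in it is made up for illustration: the `Database` class, the checkpoint dict and the hash chain are toy stand-ins, not CouchDB or PouchDB APIs, and I'm assuming the restored database is the one we replicate from, as in Adam's read-only scenario.

```python
import copy
import hashlib
import uuid


class Database:
    """Toy stand-in for a database: a uuid fixed at creation plus an ordered change log."""

    def __init__(self):
        self.uuid = uuid.uuid4().hex
        self.log = []  # list of (doc_id, rev); seq is the 1-based position

    def write(self, doc_id, rev):
        self.log.append((doc_id, rev))
        return len(self.log)  # the new update sequence

    def changes(self, since=0):
        """Everything written after the given sequence."""
        return self.log[since:]

    def rolling_hash(self, seq):
        """Hypothetical hash chain over the (id, rev) pairs up to and including seq."""
        digest = b""
        for doc_id, rev in self.log[:seq]:
            digest = hashlib.sha256(digest + f"{doc_id}:{rev}".encode()).digest()
        return digest.hex()


source = Database()
for i in range(3):
    source.write(f"doc{i}", "1-a")

backup = copy.deepcopy(source)  # file-based backup taken at seq 3

source.write("doc3", "1-a")
source.write("doc4", "1-a")

# A replication runs to seq 5 and the checkpoint is stored outside the source.
checkpoint = {"uuid": source.uuid, "seq": 5, "hash": source.rolling_hash(5)}

# The VM is wiped and restored from the backup, then different writes arrive.
source = copy.deepcopy(backup)
for name in ("other1", "other2", "other3"):
    source.write(name, "1-b")

uuid_matches = source.uuid == checkpoint["uuid"]       # same uuid after restore
resumed = source.changes(since=checkpoint["seq"])      # what a blind resume sees
skipped = source.log[len(backup.log):checkpoint["seq"]]  # written at seq 4-5, never sent
hash_matches = source.rolling_hash(checkpoint["seq"]) == checkpoint["hash"]

print(uuid_matches, resumed, skipped, hash_matches)
```

In this toy run the uuid check passes, the resume from since=5 only sees other3, and other1 and other2 never replicate. A rolling hash over the {id, rev} pairs up to the checkpointed seq, along the lines Adam suggested, no longer matches, so it would have flagged the restore.
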

On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <[email protected]> wrote:

> I dont understand the problem with per db uuids, so the uuid isnt
> multivalued nor is it queried
>
> A is readyonly, B is client, B starts replication from A
> B reads the db uuid from A / itself, generates a replication_id, stores
> on B
> try to fetch replication checkpoint, if successful we query changes from
> since?
>
> In pouch we store the uuid along with the data, so file based backups arent
> a problem, seems couchdb could / should do that too
>
> This also fixes the problem mentioned on the mailing list, and one I have
> run into personally where people forward db requests but not server
> requests via a proxy
>
>
> On 15 April 2014 19:18, Calvin Metcalf <[email protected]> wrote:
>
> > except there is no way to calculate that from outside the database as
> > changes only ever gives the more recent document version.
> >
> >
> > On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <[email protected]> wrote:
> >
> > > oo didn't think of that, yeah uuids wouldn't hurt, though the more I
> > > think about the rolling hashing on revs, the more I like that
> > >
> > >
> > > On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:
> > >
> > >> Yes, but then sysadmins have to be very very careful about restoring
> > >> from a file-based backup. We run the risk that {uuid, seq} could be
> > >> multi-valued, which diminishes its value considerably.
> > >>
> > >> I like the UUID in general -- we've added them to our internal shard
> > >> files at Cloudant -- but on their own they're not a bulletproof
> > >> solution for read-only incremental replications.
> > >>
> > >> Adam
> > >>
> > >> > On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:
> > >> >
> > >> > I mean if your going to add new features to couch you could just
> > >> > have the db generate a random uuid on creation that would be
> > >> > different if it was deleted and recreated
> > >> >> On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <[email protected]> wrote:
> > >> >>
> > >> >> Other thoughts:
> > >> >>
> > >> >> - We could enhance the authorization system to have a role that
> > >> >> allows updates to _local docs but nothing else. It wouldn't make
> > >> >> sense for completely untrusted peers, but it could give peace of
> > >> >> mind to sysadmins trying to execute replications with the minimum
> > >> >> level of access possible.
> > >> >>
> > >> >> - We could teach the sequence index to maintain a report of rolling
> > >> >> hash of the {id,rev} pairs that comprise the database up to that
> > >> >> sequence, record that in the replication checkpoint document, and
> > >> >> check that it's unchanged on resume. It's a new API enhancement and
> > >> >> it grows the amount of information stored with each sequence, but
> > >> >> it completely closes off the probabilistic edge case associated
> > >> >> with simply checking that the {id, rev} associated with the
> > >> >> checkpointed sequence has not changed. Perhaps overkill for what is
> > >> >> admittedly a pretty low-probability event.
> > >> >>
> > >> >> Adam
> > >> >>
> > >> >> On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <[email protected]> wrote:
> > >> >>
> > >> >>> Yeah, this is a subtle little thing. The main reason we checkpoint
> > >> >>> on both source and target and compare is to cover the case where
> > >> >>> the source database is deleted and recreated in between
> > >> >>> replication attempts. If that were to happen and the replicator
> > >> >>> just resumes blindly from the checkpoint sequence stored on the
> > >> >>> target then the replication could permanently miss some documents
> > >> >>> written to the new source.
> > >> >>>
> > >> >>> I'd love to have a robust solution for incremental replication of
> > >> >>> read-only databases. To first order a UUID on the source database
> > >> >>> that was fixed at create time could do the trick, but we'll run
> > >> >>> into trouble with file-based backup and restores. If a database
> > >> >>> file is restored to a point before the latest replication
> > >> >>> checkpoint we'd again be in a position of potentially permanently
> > >> >>> missing updates.
> > >> >>>
> > >> >>> Calvin's suggestion of storing e.g. {seq, id, rev} instead of
> > >> >>> simply seq as the checkpoint information would dramatically reduce
> > >> >>> the likelihood of that type of permanent skip in the replication,
> > >> >>> but it's only a probabilistic answer.
> > >> >>>
> > >> >>> Adam
> > >> >>>
> > >> >>>> On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <[email protected]> wrote:
> > >> >>>>
> > >> >>>> Though currently we have the opposite problem right if we delete
> > >> >>>> the target db? (this on me brain storming)
> > >> >>>>
> > >> >>>> Could we store last rev in addition to last seq?
> > >> >>>>> On Apr 13, 2014 1:15 PM, "Dale Harvey" <[email protected]> wrote:
> > >> >>>>>
> > >> >>>>> If the src database was to be wiped, when we restarted
> > >> >>>>> replication nothing would happen until the source database
> > >> >>>>> caught up to the previously written checkpoint
> > >> >>>>>
> > >> >>>>> create A, write 5 documents
> > >> >>>>> replicate 5 documents A -> B, write checkpoint 5 on B
> > >> >>>>> destroy A
> > >> >>>>> write 4 documents
> > >> >>>>> replicate A -> B, pick up checkpoint from B and to ?since=5
> > >> >>>>> .. no documents written
> > >> >>>>>
> > >> >>>>> https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771
> > >> >>>>> is our test that covers it
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> On 13 April 2014 18:02, Calvin Metcalf <[email protected]> wrote:
> > >> >>>>>
> > >> >>>>>> If we were to unilaterally switch to checkpoint on target what
> > >> >>>>>> would happen, replication in progress would loose their place?
> > >> >>>>>>> On Apr 13, 2014 11:21 AM, "Dale Harvey" <[email protected]> wrote:
> > >> >>>>>>>
> > >> >>>>>>> So with checkpointing we write the checkpoint to both A and B
> > >> >>>>>>> and verify they match before using the checkpoint
> > >> >>>>>>>
> > >> >>>>>>> What happens if the src of the replication is read only?
> > >> >>>>>>>
> > >> >>>>>>> As far as I can tell couch will just checkout a
> > >> >>>>>>> checkpoint_commit_error and carry on from the start, The only
> > >> >>>>>>> improvement I can think of is the user specifies they know
> > >> >>>>>>> the src is read only and to only use the target checkpoint,
> > >> >>>>>>> we can 'possibly' make that happen automatically if the src
> > >> >>>>>>> specifically fails the write due to permissions.
> > >> >>
> > >
> > > --
> > > -Calvin W. Metcalf
> >
> > --
> > -Calvin W. Metcalf
>

--
-Calvin W. Metcalf
