Ooh, didn't think of that. Yeah, UUIDs wouldn't hurt, though the more I think about the rolling hash on revs, the more I like it.
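To make the rolling hash idea concrete, here is a minimal sketch of how I read it, not anything that exists in CouchDB or PouchDB today. It assumes the source keeps one running digest per update sequence, folded from the {id, rev} pairs written so far, and that the checkpoint stores {seq, hash} instead of just seq. The names `roll`, `ToySource`, `hash_at` and `can_resume` are made up for illustration, and the toy ignores the fact that a real sequence index drops a document's older entries when it is updated.

```python
import hashlib


def roll(prev: bytes, doc_id: str, rev: str) -> bytes:
    """Fold one {id, rev} pair into the running digest."""
    h = hashlib.sha256()
    h.update(prev)
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    for field in (doc_id, rev):
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()


class ToySource:
    """Hypothetical stand-in for a source database's sequence index."""

    def __init__(self):
        # hashes[seq] is the digest up to that sequence; seq 0 is the empty db.
        self.hashes = [b"\x00" * 32]

    def write(self, doc_id: str, rev: str) -> int:
        self.hashes.append(roll(self.hashes[-1], doc_id, rev))
        return len(self.hashes) - 1  # the new update sequence

    def hash_at(self, seq: int):
        return self.hashes[seq] if 0 <= seq < len(self.hashes) else None


def can_resume(source: ToySource, checkpoint) -> bool:
    """Resume only if the source still has the same history up to seq."""
    seq, expected = checkpoint
    return source.hash_at(seq) == expected


# Dale's scenario: create A, write 5 documents, checkpoint at 5.
a = ToySource()
for i in range(5):
    seq = a.write(f"doc{i}", "1-abc")
checkpoint = (seq, a.hash_at(seq))
ok_before_delete = can_resume(a, checkpoint)

# Destroy A, recreate it, write 4 documents: seq 5 does not exist yet.
a = ToySource()
for i in range(4):
    a.write(f"new{i}", "1-def")
ok_after_four = can_resume(a, checkpoint)

# A fifth write reaches seq 5, but the history behind it is different.
a.write("new4", "1-def")
ok_after_five = can_resume(a, checkpoint)
```

In this toy the replicator would resume from the checkpoint only in the first case and start over in the other two, without needing to write anything to the source.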
On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:

> Yes, but then sysadmins have to be very very careful about restoring from
> a file-based backup. We run the risk that {uuid, seq} could be
> multi-valued, which diminishes its value considerably.
>
> I like the UUID in general -- we've added them to our internal shard files
> at Cloudant -- but on their own they're not a bulletproof solution for
> read-only incremental replications.
>
> Adam
>
>
> On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:
>
> > I mean if you're going to add new features to couch you could just have the
> > db generate a random uuid on creation that would be different if it was
> > deleted and recreated
> >
> >> On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <[email protected]> wrote:
> >>
> >> Other thoughts:
> >>
> >> - We could enhance the authorization system to have a role that allows
> >> updates to _local docs but nothing else. It wouldn't make sense for
> >> completely untrusted peers, but it could give peace of mind to sysadmins
> >> trying to execute replications with the minimum level of access possible.
> >>
> >> - We could teach the sequence index to maintain a rolling hash
> >> of the {id, rev} pairs that comprise the database up to that sequence,
> >> record that in the replication checkpoint document, and check that it's
> >> unchanged on resume. It's a new API enhancement and it grows the amount of
> >> information stored with each sequence, but it completely closes off the
> >> probabilistic edge case associated with simply checking that the {id, rev}
> >> associated with the checkpointed sequence has not changed. Perhaps overkill
> >> for what is admittedly a pretty low-probability event.
> >>
> >> Adam
> >>
> >> On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <[email protected]> wrote:
> >>
> >>> Yeah, this is a subtle little thing. The main reason we checkpoint on
> >>> both source and target and compare is to cover the case where the source
> >>> database is deleted and recreated in between replication attempts. If that
> >>> were to happen and the replicator just resumed blindly from the checkpoint
> >>> sequence stored on the target, then the replication could permanently miss
> >>> some documents written to the new source.
> >>>
> >>> I'd love to have a robust solution for incremental replication of
> >>> read-only databases. To first order a UUID on the source database that was
> >>> fixed at create time could do the trick, but we'll run into trouble with
> >>> file-based backup and restores. If a database file is restored to a point
> >>> before the latest replication checkpoint we'd again be in a position of
> >>> potentially permanently missing updates.
> >>>
> >>> Calvin's suggestion of storing e.g. {seq, id, rev} instead of simply seq
> >>> as the checkpoint information would dramatically reduce the likelihood of
> >>> that type of permanent skip in the replication, but it's only a
> >>> probabilistic answer.
> >>>
> >>> Adam
> >>>
> >>>> On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <[email protected]> wrote:
> >>>>
> >>>> Though currently we have the opposite problem, right, if we delete the
> >>>> target db? (this is me brainstorming)
> >>>>
> >>>> Could we store last rev in addition to last seq?
> >>>>
> >>>>> On Apr 13, 2014 1:15 PM, "Dale Harvey" <[email protected]> wrote:
> >>>>>
> >>>>> If the src database was to be wiped, when we restarted replication nothing
> >>>>> would happen until the source database caught up to the previously written
> >>>>> checkpoint
> >>>>>
> >>>>> create A, write 5 documents
> >>>>> replicate 5 documents A -> B, write checkpoint 5 on B
> >>>>> destroy A
> >>>>> write 4 documents
> >>>>> replicate A -> B, pick up checkpoint from B and do ?since=5
> >>>>> .. no documents written
> >>>>>
> >>>>> https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771
> >>>>> is our test that covers it
> >>>>>
> >>>>>
> >>>>> On 13 April 2014 18:02, Calvin Metcalf <[email protected]> wrote:
> >>>>>
> >>>>>> If we were to unilaterally switch to checkpoint on target what would
> >>>>>> happen, replications in progress would lose their place?
> >>>>>>
> >>>>>>> On Apr 13, 2014 11:21 AM, "Dale Harvey" <[email protected]> wrote:
> >>>>>>>
> >>>>>>> So with checkpointing we write the checkpoint to both A and B and
> >>>>>>> verify they match before using the checkpoint
> >>>>>>>
> >>>>>>> What happens if the src of the replication is read only?
> >>>>>>>
> >>>>>>> As far as I can tell couch will just checkout a checkpoint_commit_error
> >>>>>>> and carry on from the start. The only improvement I can think of is the
> >>>>>>> user specifies they know the src is read only and to only use the target
> >>>>>>> checkpoint, we can 'possibly' make that happen automatically if the src
> >>>>>>> specifically fails the write due to permissions.

--
-Calvin W. Metcalf
