Except there is no way to calculate that from outside the database, as changes only ever gives the most recent version of each document.
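
A rough sketch of the problem, assuming the rolling hash folds in every {id, rev} update in sequence order. The hash construction, the function names and the simplified changes model below are all illustrative assumptions, not CouchDB or PouchDB code:

```python
import hashlib


def rolling_hash(prev, doc_id, rev):
    """Fold one {id, rev} pair into the running hash (hypothetical construction)."""
    return hashlib.sha256(f"{prev}:{doc_id}:{rev}".encode()).hexdigest()


def hash_of_updates(pairs):
    """Hash a list of (id, rev) pairs in order, as the database itself could."""
    h = ""
    for doc_id, rev in pairs:
        h = rolling_hash(h, doc_id, rev)
    return h


def changes_feed(updates):
    """Simplified model of changes: each document shows up once, at the
    position of its latest update, carrying only its latest rev."""
    latest = {}
    for seq, (doc_id, rev) in enumerate(updates, start=1):
        latest[doc_id] = (seq, rev)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(doc_id, rev) for doc_id, (_seq, rev) in ordered]


# Document "a" is updated twice; the first revision never appears in changes.
updates = [("a", "1-x"), ("b", "1-y"), ("a", "2-z")]

inside = hash_of_updates(updates)                 # what the database would store
outside = hash_of_updates(changes_feed(updates))  # what a client could rebuild
print(inside == outside)
```

The client only ever sees two of the three updates, so it has nothing to feed the hash for the superseded revision of "a".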

On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <[email protected]> wrote:

> oo didn't think of that, yeah uuids wouldn't hurt, though the more I think
> about the rolling hashing on revs, the more I like that
>
>
> On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:
>
>> Yes, but then sysadmins have to be very very careful about restoring from
>> a file-based backup. We run the risk that {uuid, seq} could be
>> multi-valued, which diminishes its value considerably.
>>
>> I like the UUID in general -- we've added them to our internal shard
>> files at Cloudant -- but on their own they're not a bulletproof solution
>> for read-only incremental replications.
>>
>> Adam
>>
>> > On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:
>> >
>> > I mean if you're going to add new features to couch you could just have the
>> > db generate a random uuid on creation that would be different if it was
>> > deleted and recreated
>> >
>> >> On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <[email protected]> wrote:
>> >>
>> >> Other thoughts:
>> >>
>> >> - We could enhance the authorization system to have a role that allows
>> >> updates to _local docs but nothing else. It wouldn't make sense for
>> >> completely untrusted peers, but it could give peace of mind to sysadmins
>> >> trying to execute replications with the minimum level of access possible.
>> >>
>> >> - We could teach the sequence index to maintain a report of rolling hash
>> >> of the {id,rev} pairs that comprise the database up to that sequence,
>> >> record that in the replication checkpoint document, and check that it's
>> >> unchanged on resume. It's a new API enhancement and it grows the amount of
>> >> information stored with each sequence, but it completely closes off the
>> >> probabilistic edge case associated with simply checking that the {id, rev}
>> >> associated with the checkpointed sequence has not changed. Perhaps overkill
>> >> for what is admittedly a pretty low-probability event.
>> >>
>> >> Adam
>> >>
>> >> On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <[email protected]> wrote:
>> >>
>> >>> Yeah, this is a subtle little thing. The main reason we checkpoint on
>> >>> both source and target and compare is to cover the case where the source
>> >>> database is deleted and recreated in between replication attempts. If that
>> >>> were to happen and the replicator just resumes blindly from the checkpoint
>> >>> sequence stored on the target then the replication could permanently miss
>> >>> some documents written to the new source.
>> >>>
>> >>> I'd love to have a robust solution for incremental replication of
>> >>> read-only databases. To first order a UUID on the source database that was
>> >>> fixed at create time could do the trick, but we'll run into trouble with
>> >>> file-based backup and restores. If a database file is restored to a point
>> >>> before the latest replication checkpoint we'd again be in a position of
>> >>> potentially permanently missing updates.
>> >>>
>> >>> Calvin's suggestion of storing e.g. {seq, id, rev} instead of simply seq
>> >>> as the checkpoint information would dramatically reduce the likelihood of
>> >>> that type of permanent skip in the replication, but it's only a
>> >>> probabilistic answer.
>> >>>
>> >>> Adam
>> >>>
>> >>>> On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <[email protected]> wrote:
>> >>>>
>> >>>> Though currently we have the opposite problem right if we delete the target
>> >>>> db? (this is me brainstorming)
>> >>>>
>> >>>> Could we store last rev in addition to last seq?
>> >>>>> On Apr 13, 2014 1:15 PM, "Dale Harvey" <[email protected]> wrote:
>> >>>>>
>> >>>>> If the src database was to be wiped, when we restarted replication nothing
>> >>>>> would happen until the source database caught up to the previously written
>> >>>>> checkpoint
>> >>>>>
>> >>>>> create A, write 5 documents
>> >>>>> replicate 5 documents A -> B, write checkpoint 5 on B
>> >>>>> destroy A
>> >>>>> write 4 documents
>> >>>>> replicate A -> B, pick up checkpoint from B and to ?since=5
>> >>>>> .. no documents written
>> >>>>>
>> >>>>> https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771 is
>> >>>>> our test that covers it
>> >>>>>
>> >>>>>
>> >>>>> On 13 April 2014 18:02, Calvin Metcalf <[email protected]> wrote:
>> >>>>>
>> >>>>>> If we were to unilaterally switch to checkpoint on target what would
>> >>>>>> happen, replication in progress would lose their place?
>> >>>>>>> On Apr 13, 2014 11:21 AM, "Dale Harvey" <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>> So with checkpointing we write the checkpoint to both A and B and
>> >>>>>>> verify they match before using the checkpoint
>> >>>>>>>
>> >>>>>>> What happens if the src of the replication is read only?
>> >>>>>>>
>> >>>>>>> As far as I can tell couch will just checkout a checkpoint_commit_error and
>> >>>>>>> carry on from the start, The only improvement I can think of is the user
>> >>>>>>> specifies they know the src is read only and to only use the target
>> >>>>>>> checkpoint, we can 'possibly' make that happen automatically if the src
>> >>>>>>> specifically fails the write due to permissions.
>> >>
>> >>
>
>
>
> --
> -Calvin W. Metcalf

--
-Calvin W. Metcalf
