No, that one just keeps the last N. For internal replication checkpoints in a cluster Paul and I worked out something a bit smarter and more subtle:
https://github.com/cloudant/mem3/blob/master/src/mem3_rpc.erl#L152-L201

Cheers,

Adam

On Apr 23, 2014, at 11:58 AM, Calvin Metcalf <[email protected]> wrote:

this function?
https://github.com/cloudant/bigcouch/blob/master/apps/couch/src/couch_rep.erl#L687-L781

On Wed, Apr 23, 2014 at 11:35 AM, Adam Kocoloski <[email protected]> wrote:

There's an algorithm in the BigCouch codebase for storing up to N checkpoints with exponentially increasing granularity (in terms of sequence values) between them. It strikes a nice balance between checkpoint document size and the ability to resume with minimal replay.

Adam

On Apr 23, 2014, at 11:28 AM, Calvin Metcalf <[email protected]> wrote:

with the rolling hash thingy, a checkpoint document could store more than one database hash, e.g. the last 5, but that's totally up to whoever is storing the checkpoint. This would cover the case where you stop the replication after one of the dbs has stored the checkpoint but before the other one has.

On Tue, Apr 15, 2014 at 9:21 PM, Dale Harvey <[email protected]> wrote:

ah, yeah, got it now, cheers

On 16 April 2014 02:17, Calvin Metcalf <[email protected]> wrote:

Your source database is up to seq 10, but the box it's on catches fire. You have a backup, though, but it's at seq 8; same UUID, but you'll miss the next 2 seqs.

On Apr 15, 2014 8:57 PM, "Dale Harvey" <[email protected]> wrote:

Sorry, still don't understand the problem here.

The uuid is stored inside the database file; you either have the same data and the same uuid, or none of them?
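The "exponentially increasing granularity" scheme Adam describes above can be sketched roughly like this. This is an illustrative Python toy with an assumed simple doubling rule (`prune_checkpoints` and `max_count` are invented names), not the actual BigCouch/mem3 Erlang code linked above:

```python
def prune_checkpoints(checkpoints, max_count=10):
    """Thin a newest-first list of checkpoint seqs so the gap between
    surviving entries roughly doubles with age; keep at most max_count.

    Illustrative toy, not the BigCouch algorithm itself.
    """
    kept = []
    last = None
    gap = 1
    for seq in checkpoints:
        if last is None or last - seq >= gap:
            kept.append(seq)
            last = seq
            gap *= 2          # required spacing doubles as entries age
        if len(kept) == max_count:
            break
    return kept

# 20 consecutive checkpoints collapse to a handful with doubling gaps:
print(prune_checkpoints(list(range(20, 0, -1))))   # [20, 18, 14, 6]
```

On resume, the replicator would pick the newest retained checkpoint at or below the sequence both sides agree on, so recent history stays fine-grained while old history costs only a few entries in the checkpoint document.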
On 15 April 2014 19:54, Calvin Metcalf <[email protected]> wrote:

I think the problem is not so much deleting and recreating a database but wiping a virtual machine and restoring from a backup; now you have more or less gone back in time with the target database and it has different stuff but the same uuid.

On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <[email protected]> wrote:

I don't understand the problem with per-db uuids; the uuid isn't multivalued, nor is it queried.

A is read-only, B is the client, B starts replication from A:
B reads the db uuid from A / itself, generates a replication_id, stores it on B
try to fetch the replication checkpoint; if successful we query changes from since?

In Pouch we store the uuid along with the data, so file-based backups aren't a problem; seems CouchDB could / should do that too.

This also fixes the problem mentioned on the mailing list, and one I have run into personally, where people forward db requests but not server requests via a proxy.

On 15 April 2014 19:18, Calvin Metcalf <[email protected]> wrote:

except there is no way to calculate that from outside the database, as changes only ever gives the most recent document version.
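The per-db-uuid scheme Dale describes above amounts to deriving the replication id from the two databases' creation-time uuids. A minimal sketch, assuming an md5 digest over the pair (the function name and hashing details here are illustrative; CouchDB's real replication id calculation also mixes in filters and options):

```python
import hashlib

def replication_id(source_uuid, target_uuid):
    """Illustrative: a stable id derived from both databases' uuids."""
    h = hashlib.md5()
    h.update(source_uuid.encode("utf-8"))
    h.update(target_uuid.encode("utf-8"))
    return h.hexdigest()

# The checkpoint would live in a _local doc keyed by this id. If either
# database is deleted and recreated, its uuid changes, so the id changes,
# no checkpoint is found, and replication safely starts from seq 0.
```

This is exactly why the backup/restore case discussed above is the weak spot: a restored file keeps the old uuid, so the id still matches even though the sequence history has been rewound.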
On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <[email protected]> wrote:

oo, didn't think of that. yeah, uuids wouldn't hurt, though the more I think about the rolling hashing on revs, the more I like that

On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:

Yes, but then sysadmins have to be very, very careful about restoring from a file-based backup. We run the risk that {uuid, seq} could be multi-valued, which diminishes its value considerably.

I like the UUID in general -- we've added them to our internal shard files at Cloudant -- but on their own they're not a bulletproof solution for read-only incremental replications.

Adam

On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:

I mean, if you're going to add new features to couch you could just have the db generate a random uuid on creation that would be different if it was deleted and recreated.

On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <[email protected]> wrote:

Other thoughts:

- We could enhance the authorization system to have a role that allows updates to _local docs but nothing else. It wouldn't make sense for completely untrusted peers, but it could give peace of mind to sysadmins trying to execute replications with the minimum level of access possible.
- We could teach the sequence index to maintain a rolling hash of the {id, rev} pairs that comprise the database up to that sequence, record that in the replication checkpoint document, and check that it's unchanged on resume. It's a new API enhancement and it grows the amount of information stored with each sequence, but it completely closes off the probabilistic edge case associated with simply checking that the {id, rev} associated with the checkpointed sequence has not changed. Perhaps overkill for what is admittedly a pretty low-probability event.

Adam

On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <[email protected]> wrote:

Yeah, this is a subtle little thing. The main reason we checkpoint on both source and target and compare is to cover the case where the source database is deleted and recreated in between replication attempts. If that were to happen and the replicator just resumed blindly from the checkpoint sequence stored on the target, then the replication could permanently miss some documents written to the new source.

I'd love to have a robust solution for incremental replication of read-only databases.
To first order, a UUID on the source database that was fixed at create time could do the trick, but we'll run into trouble with file-based backups and restores. If a database file is restored to a point before the latest replication checkpoint, we'd again be in a position of potentially permanently missing updates.

Calvin's suggestion of storing e.g. {seq, id, rev} instead of simply seq as the checkpoint information would dramatically reduce the likelihood of that type of permanent skip in the replication, but it's only a probabilistic answer.

Adam

On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <[email protected]> wrote:

Though currently we have the opposite problem, right, if we delete the target db? (this is me brainstorming)

Could we store the last rev in addition to the last seq?

On Apr 13, 2014 1:15 PM, "Dale Harvey" <[email protected]> wrote:

If the src database was to be wiped, when we restarted replication nothing would happen until the source database caught up to the previously written checkpoint:

create A, write 5 documents
replicate 5 documents A -> B, write checkpoint 5 on B
destroy A
write 4 documents
replicate A -> B, pick up checkpoint from B and do ?since=5
.. no documents written

https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771 is our test that covers it.

On 13 April 2014 18:02, Calvin Metcalf <[email protected]> wrote:

If we were to unilaterally switch to checkpointing on the target, what would happen? Would replications in progress lose their place?

On Apr 13, 2014 11:21 AM, "Dale Harvey" <[email protected]> wrote:

So with checkpointing we write the checkpoint to both A and B and verify they match before using the checkpoint.

What happens if the src of the replication is read-only?

As far as I can tell, couch will just throw a checkpoint_commit_error and carry on from the start. The only improvement I can think of is that the user specifies they know the src is read-only and to only use the target checkpoint; we can 'possibly' make that happen automatically if the src specifically fails the write due to permissions.

--
-Calvin W. Metcalf
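Dale's wiped-source scenario, and why Calvin's {seq, id, rev} checkpoint helps, can be reproduced with a toy model. Plain Python, nothing CouchDB-specific; `changes` and `checkpoint_valid` are invented names standing in for the _changes feed and the checkpoint validation step:

```python
def changes(db, since=0):
    """Toy _changes feed: db is a list of (seq, doc_id, rev) tuples."""
    return [c for c in db if c[0] > since]

# create A, write 5 documents; replicate A -> B, checkpoint 5 on B
source = [(i, "doc%d" % i, "1-a") for i in range(1, 6)]
target = list(changes(source, since=0))
checkpoint = target[-1][0]        # plain seq checkpoint: 5
ckpt_tuple = target[-1]           # Calvin's {seq, id, rev}: (5, "doc5", "1-a")

# destroy A, write 4 new documents, replicate again from ?since=5
source = [(i, "other%d" % i, "1-b") for i in range(1, 5)]
missed = changes(source, since=checkpoint)
assert missed == []               # nothing replicates: all 4 docs are skipped

def checkpoint_valid(db, ckpt):
    """Check the checkpointed (seq, id, rev) still exists on the source."""
    return ckpt in db

# The richer checkpoint detects the reset, so we can restart from seq 0
assert not checkpoint_valid(source, ckpt_tuple)
```

As the thread notes, this check is only probabilistic: a recreated source that happened to carry the same {id, rev} at the checkpointed seq would still slip through, which is the edge case the rolling-hash proposal closes off.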
