Alex,

The first con I see for that approach is that it's not soft-deletion.
It's actual deletion with an API for restoration. Which, fair enough,
is probably a feature we should consider supporting for CouchDB
installations that are based on FoundationDB.

The second major con is that it relies on CouchDB being based on
FoundationDB. Part of CouchDB's design philosophy is that the internet
may or may not exist, and if it does exist that it may or may not be
reliable. There are lots of deployments of CouchDB that are part of a
desktop application or POS installation that may see internet only
periodically, if at all, so an S3 backup solution is out. There also may
come a time that there's a flavor of CouchDB that uses LevelDB or
SQLite or FDBLite (I just made that up, any idea how hard it'd be?)
for these sorts of embedded deployments where fdbrestore/fdbbackup
wouldn't be feasible.

Then the last major con I see is the time-to-restore disparity. With
soft-deletion, restoration takes a few milliseconds. Streaming from S3
will depend on the size of the database and will obviously be
orders of magnitude longer.
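
To make the disparity concrete, here is a minimal Python sketch of why a
soft-delete restore can be that cheap. It is an illustration under stated
assumptions, not the RFC's design: the class, its method names, and the
in-memory dicts standing in for FoundationDB keyspaces are all hypothetical.
The point is only that delete and restore move one directory entry, while
the documents stay where they are until the retention period expires.

```python
import time


class SoftDeleteCatalog:
    """Toy in-memory model of soft-deletion (all names are hypothetical)."""

    def __init__(self):
        self.live = {}      # db name -> subspace prefix
        self.deleted = {}   # (db name, deletion timestamp) -> subspace prefix
        self.data = {}      # subspace prefix -> documents
        self._next_id = 0

    def create(self, name, docs):
        if name in self.live:
            raise ValueError(f"database {name!r} already exists")
        prefix = f"db/{self._next_id}/"
        self._next_id += 1
        self.live[name] = prefix
        self.data[prefix] = dict(docs)

    def docs(self, name):
        return self.data[self.live[name]]

    def soft_delete(self, name, now=None):
        # Only the directory entry moves; the documents are left in place.
        ts = int(time.time() if now is None else now)
        self.deleted[(name, ts)] = self.live.pop(name)
        return ts

    def restore(self, name, ts):
        # Restoring is another metadata move, independent of database size.
        if name in self.live:
            raise ValueError(f"database {name!r} already exists")
        self.live[name] = self.deleted.pop((name, ts))

    def purge_expired(self, retention_seconds, now=None):
        # Data is actually removed only once the retention period has passed.
        now = time.time() if now is None else now
        for (name, ts), prefix in list(self.deleted.items()):
            if now - ts >= retention_seconds:
                del self.deleted[(name, ts)]
                del self.data[prefix]


catalog = SoftDeleteCatalog()
catalog.create("orders", {"doc1": {"total": 5}})
deleted_at = catalog.soft_delete("orders", now=1000)
catalog.restore("orders", deleted_at)
```

A backup-based restore, by contrast, has to read the deleted key range back
from wherever the backup lives, so its cost grows with the size of the data.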

On the pro side for soft-deletion on FoundationDB, the first draft of
the RFC is 108 lines [1]. We obviously can't say for sure how big or
involved the fdbrestore approach would be, but I think we'd all agree
it'd be bigger.

Paul

[1] https://github.com/apache/couchdb/pull/2666


On Wed, Mar 18, 2020 at 2:31 PM Alex Miller
<alexmil...@apple.com.invalid> wrote:
>
> Let me perhaps paint an alternative RFC:
>
> 1) `DELETE /{db}`
>
> If soft-deletion is enabled, delete the database subspace, and also record 
> into ?DELETED_DBS the timestamp of the commit and the database subspace prefix
>
> 2) `GET /{db}/_deleted_dbs_info`
>
> Return the timestamp (and whatever other info one should record) of deleted 
> databases.
>
> 3) `PUT /{db}/_restore/{deletedTS}`
>
> Invoke `fdbrestore -k` to do a key range restricted restore into the current 
> cluster of the deleted subspace prefix at versionstamp-1.  Wait for it to 
> complete, and return 200 when completed.
>
> And this would all rely on having a continuous backup configured and running 
> that would hold a minimum of 48 hours of changes.
>
>
> Now, I don’t actually deal with backups often so my memory on current caveats 
> is a bit fuzzy.  I think there might be a couple complications here that I’ve 
> missed, like…
> * There not being key range restricted locking of the database
> * A key range restore is currently suboptimal in that it doesn’t do obvious 
> filtering that it could to cut down on the amount of data it reads
>
> But, neither of these seem heavily blocking, as they could be tackled 
> quickly, particularly if you leverage some upstream relationships ;).  Backup 
> and restore has been the general answer to accidental data deletion (or 
> corruption) on FDB, and I could paint some attractive looking pros of this 
> approach: backup files are more disk space efficient, soft deleted data could 
> be offloaded to an S3-compatible store, it would be free if FDB is already 
> configured to take backups.  I was just curious to hear a bit more detail on 
> your/Peng’s side of the reasons for preferring to build soft deletion on top 
> of FDB (and thus have also intentionally withheld more of the cons of this 
> approach, or the pros of yours).
>
> > On Mar 18, 2020, at 11:59, Paul Davis <paul.joseph.da...@gmail.com> wrote:
> >
> > Alex,
> >
> > All joking aside, soft-deletion's target use case is accidental
> > deletions. This isn't a replacement for backup/restore which will
> > still happen for all the usual reasons.
> >
> > Paul
> >
> > On Wed, Mar 18, 2020 at 1:42 PM Paul Davis <paul.joseph.da...@gmail.com> 
> > wrote:
> >>
> >> On Wed, Mar 18, 2020 at 1:29 PM Alex Miller
> >> <alexmil...@apple.com.invalid> wrote:
> >>>
> >>>
> >>>> On Mar 18, 2020, at 05:04, jiangph <jiangpeng...@hotmail.com> wrote:
> >>>>
> >>>> Instead of automatically and immediately removing data and indexes in 
> >>>> a database after a delete operation, soft-deletion allows the deleted 
> >>>> data to be restored to its original state after a “fat finger” or 
> >>>> otherwise undesired delete, within a defined period such as 48 hours.
> >>>>
> >>>> In CouchDB 3.0, soft-deletion of databases is implemented in [1]. When 
> >>>> soft-deletion is enabled, the .couch file is renamed to 
> >>>> .<timestamp>.deleted.couch on deletion, and it can be renamed back to 
> >>>> .couch to restore the database. If no restore is needed and the 
> >>>> specified period has passed, the .<timestamp>.deleted.couch file can 
> >>>> be deleted to remove the database permanently.
> >>>>
> >>>> In CouchDB 4.0, with the introduction of FoundationDB, the data model 
> >>>> and storage have changed. In order to support soft-deletion, we 
> >>>> propose the solution below and then plan to implement it.
> >>>
> >>>
> >>>
> >>> I’ve sort of hand waved some answers to this in my head, but would you 
> >>> mind expanding a bit on the advantages of keeping soft-deleted data in 
> >>> FoundationDB as opposed to actually deleting it and relying on 
> >>> FoundationDB’s backup and restore to recover it if needed?
> >>
> >> From: Panicked User
> >> To: Customer Support
> >> Subject: URGENT! EMERGENCY DATABASE RESTORE!
> >>
> >> Dear,
> >>
> >> I have accidentally deleted my Very Important Database and need to
> >> have it restored ASAP! Without this mission critical database my
> >> company is completely offline which is costing $1B an hour!!!!!
> >>
> >> Please respond ASAP!
> >>
> >> Sincerely,
> >> Panicky McPanics
>
