On 31 Jul 2009, at 14:24, Jason Davies wrote:
Hi all,
I've been discussing adding history support to CouchDB with Jan.
I've been collecting our thoughts here: http://wiki.apache.org/couchdb/History
Probably the #1 misconception for newcomers is that the "_rev"
member in documents should be used for version control, i.e. giving
clients the ability to "roll back" in time and view a document's
state at some point in the past. We all know that when compaction
occurs, old revisions of documents are effectively deleted; hence we
tell people off whenever they suggest that the MVCC revisions concept
is just like git and can be used as a VCS. We usually recommend people
roll their own system on top of CouchDB, which would involve writing
a dbupdatenotification handler to listen for changes and push them
into a separate db.
Many would still love to have built-in support for history in
CouchDB, and for good reason, as most applications would benefit
hugely from being able to support "undo" for free to prevent
potentially catastrophic data loss if someone presses the wrong
button.
The main points of this proposal are:
1. Store the historical versions of documents in a separate
database. This is for a number of reasons: a) keeping it separate
means we don't clog up the main database with historical data; b)
history-specific views can be kept here; c) a non-intrusive
implementation is easier this way.
2. The change will be made at the couch_db layer so that *any*
change to any document in the target database will be mirrored to
the history database.
3. Each and every change to a document will result in a new document
being created in the history database (with a new ID) containing an
exact copy of that document e.g. {_id: <new ID>, doc: <exact copy of
doc> }.
4. Adding meta-data to changes can be handled by a custom _update
handler (yet to be developed) to set fields such as "last_modified"
and "last_modified_user".
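To make points 1 to 3 concrete, here is a minimal in-memory sketch in Python. It is not CouchDB code and not the proposed couch_db change: the HistoryMirror class and its save method are hypothetical names, and plain dicts stand in for the target and history databases.

```python
import copy
import uuid


class HistoryMirror:
    """Hypothetical in-memory stand-in for a target database and its history database."""

    def __init__(self):
        self.main = {}      # doc ID -> current version of the document
        self.history = {}   # history ID -> {"_id": <new ID>, "doc": <exact copy>}

    def save(self, doc):
        # Store the current version in the main database.
        self.main[doc["_id"]] = copy.deepcopy(doc)
        # Mirror every change into the history database under a fresh ID.
        history_id = uuid.uuid4().hex
        self.history[history_id] = {"_id": history_id, "doc": copy.deepcopy(doc)}
        return history_id
```

Saving the same document twice leaves one entry in main and two in history, which is the "each and every change" behaviour described in point 3.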
One use case we'd like to support is effectively (from the user's
point of view) being able to "roll back" a view to a specific point
in time, but how this would look in the history database has me
stumped so far. Rolling back a specific doc is easy; rolling back
multiple docs together seems much harder. Any suggestions welcome!
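As a rough illustration of the single-document case, the Python sketch below picks the newest stored copy of one document at or before a given time. It assumes each copied doc carries the "last_modified" field from point 4 as an ISO 8601 UTC string (so string comparison matches time order); the function name doc_as_of is hypothetical.

```python
def doc_as_of(history_docs, doc_id, timestamp):
    """Return the newest copy of doc_id modified at or before timestamp, or None."""
    candidates = [
        entry["doc"]
        for entry in history_docs
        if entry["doc"]["_id"] == doc_id
        and entry["doc"]["last_modified"] <= timestamp
    ]
    # ISO 8601 UTC strings sort chronologically, so max() finds the latest copy.
    return max(candidates, key=lambda d: d["last_modified"], default=None)
```

This only handles one document at a time; it says nothing about rolling back a whole view consistently.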
I'd argue that if you need database snapshots, you replicate to a new
database for every snapshot. The auto-history is meant more for
undoing potentially disastrous actions.
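A snapshot-by-replication could look roughly like the Python sketch below. It only builds the two HTTP requests (create the target database, then POST to /_replicate) and does not send them; the snapshot_requests function and the "<db>_snapshot_<timestamp>" naming scheme are assumptions for illustration.

```python
import json
from datetime import datetime, timezone


def snapshot_requests(source_db, now=None):
    """Build (method, path, body) tuples that snapshot source_db via replication."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical naming scheme: one new database per snapshot.
    target_db = f"{source_db}_snapshot_{now:%Y%m%d%H%M%S}"
    return [
        # Create the empty target database first.
        ("PUT", f"/{target_db}", None),
        # Then ask CouchDB to replicate the source into it.
        ("POST", "/_replicate",
         json.dumps({"source": source_db, "target": target_db})),
    ]
```

Each call yields a fresh target name, so earlier snapshots are left untouched.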
Cheers
Jan
--