Interesting proposal. Two questions:
First, I'm a little confused by the use of the phrase referential
transparency, as I understand the more technical definition in the FP
literature (call a function twice on the same input and get the same
output). I think I see the intended meaning, but please clarify if
you have another meaning in mind.
Second, do you have applications and/or envision use cases that are
dynamic enough that you want to track design doc changes? It seems to
me these are more of a development-time concern.
On Jan 5, 2009, at 8:26 AM, Antony Blakey wrote:
I've cc'd this to couchdb-user, because I think this discussion
belongs on -dev, but everyone watches -user.
One of the great features of Couch is the use of optimistic locking,
i.e. rev, as a bedrock mechanism, and the way this permeates the
API. The combination of id + rev is a reference to an
immutable value (with some caveats, one subject of this proposal).
This means that you get caching for free. By keying off id + rev,
you can cache the document along with any (functional) derived
values. Additionally you can trivially memoize functions of
multiple documents using that mechanism.
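A minimal sketch of that caching idea in Python. The RevCache class
and the fetch(doc_id, rev) callable are hypothetical names used only
for illustration; fetch stands in for whatever client call retrieves
a specific revision of a document.

```python
from typing import Any, Callable, Dict, Tuple

DocRef = Tuple[str, str]  # (document id, document rev)


class RevCache:
    """Cache documents and derived values keyed by (id, rev).

    Assumption: a given (id, rev) pair always names the same immutable
    content, so entries never need to be invalidated.
    """

    def __init__(self, fetch: Callable[[str, str], Any]) -> None:
        self._fetch = fetch  # hypothetical loader for one revision
        self._docs: Dict[DocRef, Any] = {}
        self._derived: Dict[Tuple[str, Tuple[DocRef, ...]], Any] = {}

    def get(self, doc_id: str, rev: str) -> Any:
        """Return the document, fetching it only on a cache miss."""
        key = (doc_id, rev)
        if key not in self._docs:
            self._docs[key] = self._fetch(doc_id, rev)
        return self._docs[key]

    def derive(self, name: str, fn: Callable[..., Any], *refs: DocRef) -> Any:
        """Memoize a pure function of one or more documents.

        The memo key is the function's name plus the (id, rev) of every
        input document, so a new rev of any input yields a new key.
        """
        key = (name, refs)
        if key not in self._derived:
            docs = [self.get(doc_id, rev) for doc_id, rev in refs]
            self._derived[key] = fn(*docs)
        return self._derived[key]
```

The memo is only valid if fn is a pure function of the document
contents, which is the "(functional)" qualification above.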
I use this to good effect in my application, where I aggressively
cache the documents (which are sometimes large) and therefore don't
need the document content in queries. To take advantage of this,
however, my views need to include the _rev as the value, and any
transformation that would normally happen in the map happens in the
client instead.
It would be very useful to have the rev returned wherever an id is
returned, specifically in view results. You could then use a view
without include_docs, and get the ids and revs. You can keep a
cache (per view, per db) of the results. The actual view results
only need to be fetched on a cache miss, which can be driven by the
cache machinery.
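As a rough sketch of how a client might consume such rows, assuming
each row carries a "rev" field next to "id" (this is the proposed
shape, not something view results are stated to provide today). The
resolve_rows function and the fetch callable are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def resolve_rows(
    rows: Iterable[Dict[str, Any]],
    cache: Dict[Tuple[str, str], Any],
    fetch: Callable[[str, str], Any],
) -> List[Any]:
    """Turn view rows into documents, fetching only on a cache miss.

    Assumes the proposed row shape {"id": ..., "rev": ..., ...}.
    """
    docs = []
    for row in rows:
        key = (row["id"], row["rev"])
        if key not in cache:
            # Cache miss: this (id, rev) has not been seen before.
            cache[key] = fetch(row["id"], row["rev"])
        docs.append(cache[key])
    return docs
```

A document that has not changed between two queries keeps the same
(id, rev) key, so the second query costs no document fetches for it.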
The nice thing is that all of this caching machinery can be
transparently interposed. Except when the view definition is
changed. So I also propose to have the rev and id of the design doc
returned in the view results. And for completeness, every database
should be assigned a UUID when it is created. This UUID should be
provided in the dbinfo, and for every view and view-like result.
This means that from every view result you can construct a list of
universally unique references to immutable values e.g. DB UUID +
(View id + rev) + (Document id + rev). A form of referential
transparency - and with a cache and a little bit of 100% generic
machinery, it can be true referential transparency. Clients don't
have to watch/be notified about changes to design docs, or even
database creation/deletion. Systemwide transparent caching in
particular becomes trivial.
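To illustrate, here is a sketch of building those references from a
view result. The field names "db_uuid" and "design_doc", and the
per-row "rev", are assumptions standing in for whatever names the
proposal would end up using.

```python
from typing import Any, Dict, List, NamedTuple


class Ref(NamedTuple):
    """Universally unique reference to an immutable value."""
    db_uuid: str
    ddoc_id: str
    ddoc_rev: str
    doc_id: str
    doc_rev: str


def refs_from_view(result: Dict[str, Any]) -> List[Ref]:
    """Build one Ref per row from a (hypothetical) view result.

    Assumed shape:
      {"db_uuid": ..., "design_doc": {"id": ..., "rev": ...},
       "rows": [{"id": ..., "rev": ..., ...}, ...]}
    """
    ddoc = result["design_doc"]
    return [
        Ref(result["db_uuid"], ddoc["id"], ddoc["rev"], row["id"], row["rev"])
        for row in result["rows"]
    ]
```

Because a Ref is hashable it can be used directly as a cache key, and
a change to the design doc rev or to the database UUID produces new
keys without the client having to be notified.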
So, in summary I propose:
1. Provide the document rev whenever the id is returned, such as
view results i.e. not in the document, but in the per-row hash.
2. Provide the design document id and rev in view results i.e. in
the top level hash.
3. Add a UUID to databases, and provide that in view results i.e.
in the top level hash, and all other database operation results.
I think you could do this even with reduce results, but I haven't
thought a lot about it.
I think this generalises the current API in a very useful way that
will greatly simplify, and hence 'robustify', client code. Although
I haven't checked the implementation code, my experience so far
suggests this isn't difficult.
Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
Every task involves constraint,
Solve the thing without complaint;
There are magic links and chains
Forged to loose our rigid brains.
Structures, structures, though they bind,
Strangely liberate the mind.
-- James Fallen