On 12/11/2008, at 7:02 AM, Dean Landolt wrote:

On Tue, Nov 11, 2008 at 3:08 PM, Paul Davis <[EMAIL PROTECTED] >wrote:

I think this is an interesting idea, and has mostly been done with
client libraries. ATM, I'm leaning towards saying that this is a
client extension and doesn't really belong in couch. There are a crap
load of optimizations that clients could make that couch couldn't.

I have some ideas running around in my head about doing object graph
loading etc. Things really start to get fun on the client when you
contemplate referencing other databases etc.

Anyway, if you can come up with some part of this functionality that
*must* be done server side and has a big enough use case, ideas for
patches are always welcome :D


I didn't think of it that way, but I agree. Perhaps a querytools plugin would be in order when the plugin system lands, but this is probably best left to the client. Does anybody know if jquery.couch does something like
this? If not, I'll have a go at hacking it in.

I've written a plugin (in Erlang) that allows joins by tracking updates and replicating keys to Mnesia. It has the same behaviour as the existing map-reduce views e.g. updates only on view request. You can just as easily use e.g. SQLite rather than Mnesia.

I did this to allow arbitrary join queries against my model without doing it on the client, and to centralise caching for high-performance joins such as transitive User->Permission checks when using a User/ Role/Permission model that doesn't want both relationships to be stored on the Role object.

I can separately compile and deploy this using the simple change to bin/couchdb that I described in an earlier post. It involves no change to the rest of CouchDB.

*** Suggestion ***

Now that I've done it, I realize that it's overkill, and I'm abandoning that approach, not only because very few people are going to want to do this in Erlang, but also because the same effect can be gained using _external. The single optimization required is for a request coming through _external to carry the db seqno for the request (presuming the _external request is qualified by a db). This allows you to easily avoid updating your external index, without making any request if there have been no changes since the last request. This is important because it's a performance hit if you have to make an additional HTTP request to CouchDB on every view request.

You should also use both a startkey and an endkey in the _all_docs_by_seq request, which in effect gives the same semantics as map-reduce views i.e. you don't get into any race conditions during updating because you see one particular MVCC snapshot. You can synthesize this approach by simultaneously using a notification listener and arranging for that process to talk to your _external handler, but given that the Erlang endpoint has easy access to the seqno, why not supply it to the external process, and avoid the hassle. Listening for notifications is necessary if you are going to use something like memcached, or some query mechanism that doesn't go through CouchDB (and you want to avoid requesting the db seqno on every request).

Maybe a different external that enforces a db qualifier, such as _external_view, would be appropriate. There are further optimisations I considered, such as enriching the _external protocol to allow the external process to perform the _all_docs_by_seq request (and subsequent document GETs) over the port, but on reflection the minimal change is preferable.

Finally, by not doing this in Erlang it is easier to conceive of a mechanism that allows you to deploy the _external handler via CouchDB itself. Doing the plugin in Erlang introduces the Subject/Object problem, which will it can be overcome, is a lot more fraught IMO. Paul suggested this for Erlang plugins, but I think it's more applicable to this scenario.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Did you hear about the Buddhist who refused Novocain during a root canal?
His goal: transcend dental medication.


Reply via email to