On Apr 23, 2009, at 2:52 AM, Paul Davis wrote:
I'm still kicking around ideas on how I might approach such an implementation. There are some obvious pitfalls that Wout has come upon. Specifically, some obvious implementations would place some arduous constraints on emitted keys (namely: uniqueness, type must be string, string must not start with an underscore).
I really would prefer the review DB to be, in fact, a view index, but since I don't know the code, I was trying to provide the ability to go either way.
As for the uniqueness, I don't know if that is such a heavy constraint. After all, it is only the map-only (M) views that can emit multiple non-unique KV pairs. There is no value in running a second M(R) after an M, since you can write any map_b(map_a()) as a new map_c(). Likewise, the results of map_a()+map_b() can be obtained from a single map_c() function.
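To make the map_b(map_a()) argument concrete, here is a rough sketch in Python rather than in CouchDB's own view language. Everything in it is a toy assumption: a map function is modelled as a plain function returning a list of (key, value) pairs, and map_a, map_b and map_c are only illustrative names, not real views.

```python
# Toy model (assumption): a map function returns a list of (key, value) pairs.

def map_a(doc):
    # First stage: emit one row per tag, with the title as the value.
    return [(tag, doc["title"]) for tag in doc.get("tags", [])]

def map_b(key, value):
    # Second stage: runs on a row emitted by map_a, not on a document.
    return [(key.upper(), len(value))]

def map_c(doc):
    # The fused function: apply map_b to every row map_a emits for this doc.
    rows = []
    for key, value in map_a(doc):
        rows.extend(map_b(key, value))
    return rows

def chained(docs):
    # Two passes: materialise the map_a rows first, then map over them.
    stage_one = [row for doc in docs for row in map_a(doc)]
    return [row for key, value in stage_one for row in map_b(key, value)]

def single(docs):
    # One pass with the fused map_c.
    return [row for doc in docs for row in map_c(doc)]

docs = [
    {"title": "intro", "tags": ["db", "couch"]},
    {"title": "views", "tags": ["db"]},
    {"title": "untagged"},
]
```

Both chained(docs) and single(docs) give the same rows here, which is the point: the intermediate M result never needs to be persisted.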
Therefore, review DBs should only contain the results of MR views, which by definition deliver unique KV pairs. When combining two MR views you can have KV collisions, but in all cases the MRs could be rewritten to avoid collisions (simply add a postfix to the keys that the M emits).
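A small sketch of the postfix idea, again in Python and under toy assumptions: map_reduce is a hypothetical in-memory stand-in for an MR view, the review DB is just a dict, and the "count" and "size" postfixes are made up for the example.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    # Hypothetical stand-in for an MR view: group emitted values by key,
    # then reduce each group, so every key appears exactly once.
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}

def count_by_tag(doc):
    # Emits (tag, "count") instead of the bare tag: the postfix marks the view.
    return [((tag, "count"), 1) for tag in doc["tags"]]

def size_by_tag(doc):
    # Same tags, different postfix, so its rows cannot clash with count_by_tag.
    return [((tag, "size"), doc["size"]) for tag in doc["tags"]]

docs = [
    {"tags": ["couchdb", "erlang"], "size": 10},
    {"tags": ["couchdb"], "size": 5},
]

# Toy review DB holding the results of both MR views side by side.
review = {}
review.update(map_reduce(docs, count_by_tag, sum))
review.update(map_reduce(docs, size_by_tag, sum))
```

Without the postfix both views would emit the bare tag as the key, and the second update would silently overwrite the rows of the first.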
As for the _id of a result row: if the review DB were simply a view index, no _id would be needed. If it has to be a DB, the _id can simply be the string version of the JSON representation of the key (it will never start with an underscore, but it is quite wasteful) or some sequence number for the review DB.
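As a quick illustration of the JSON-string _id, here is a minimal Python sketch. row_id is a hypothetical helper name, and the compact separators and sorted object keys are my own assumptions to keep the string stable for equal keys.

```python
import json

def row_id(key):
    # Serialise the emitted key to compact JSON text and use that as the _id.
    # JSON text starts with a quote, bracket, brace, digit, minus sign,
    # or one of true/false/null, so it can never start with an underscore.
    return json.dumps(key, sort_keys=True, separators=(",", ":"))

# Even a string key that itself starts with "_" is safe, because the
# serialised form starts with the opening quote.
example_ids = [row_id(k) for k in ("_foo", ["_design", 1], {"b": 2, "a": 1}, 42, None)]
```

The wasteful part is visible too: the _id repeats the whole key, so long compound keys are stored twice per row.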
So there are some things to contemplate, but I think the general idea is pretty solid. Also, as we start putting some serious effort into clustering CouchDB, there's the eyebrow-raising aspect that if we persist to DBs we might be able to leverage a lot of that for some added awesomeness.
I feel the same way. I've also been contemplating using this persistence for _temp_views. I think it can be done, given garbage collection on the temporary review DBs. Then you could use the CouchDB view server farm to calculate multi-dimensional views (for example, all documents with tag A AND tag B AND younger than 2 days).
Wout.
