On Dec 31, 2008, at 7:37 AM, Robert Dionne wrote:


On Dec 30, 2008, at 8:33 PM, Antony Blakey wrote:


1. The current scheme of prepending _ to atom names when the atom is used inside a document. Con is the breakage of name identity, which has technical consequences as well as cognitive ones. Does the rule only apply at the top level of a document? What about future injected metadata that has internal structure?

2. Use '_' for all atoms, inside and outside documents. Con is the noise of extra underscores everywhere.

3. Don't use underscores inside documents - for id and rev at least, this wouldn't seem to be a big issue, but isn't future-proof if you want to handle other injected fields.

4. Use '_' for atoms that have to be injected, and make the name BE the '_' form. Con is that you have to decide in advance if an atom is going to ever be injected.

5. Use a '_meta' wrapper for the metadata. I don't see any technical cons, and IMO it is by far the cleanest model. Name identity is preserved, it's arbitrarily extensible without scalability concerns, and it is structural rather than lexical.

It is clearly cleaner and has its advantages; however, I have to agree with an earlier poster: "Putting them in a _meta group might encourage aggregation and manipulation of the bookkeeping metadata separately from the document, which to me sounds like a recipe for trouble."

What trouble? I think this is *exactly* what should be done - have CouchDB store documents that are:

 {
    metadata : { _rev : X, _id : Y, _woogie : Z, .... anything that needs to be added in the future, such as a last update date ... },
    userdata : {  .... the document you want to store .... }
 }

and then offer APIs that let you :

a) get to this document, for libraries and clients that know they are talking to Couch and want to manipulate at this level

b) return and accept the user document directly, for clients that just want to consume or produce JSON data, w/o caring about the internal housekeeping
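As a rough sketch of what (a) and (b) could look like, assuming the envelope layout above: the function names here (wrap, get_envelope, get_userdata) are made up for illustration and are not CouchDB APIs.

```python
# Hypothetical sketch only: none of these names exist in CouchDB.

def wrap(userdata, doc_id, rev, **extra):
    """Build the stored envelope: bookkeeping in 'metadata', payload in 'userdata'."""
    metadata = {"_id": doc_id, "_rev": rev}
    metadata.update(extra)  # room for future fields, e.g. a last update date
    return {"metadata": metadata, "userdata": userdata}

def get_envelope(stored):
    """(a) Couch-aware libraries and clients see the whole envelope."""
    return stored

def get_userdata(stored):
    """(b) Plain JSON consumers get back exactly what they stored."""
    return stored["userdata"]

# The user document may use any key it likes, including '_id',
# because it never shares a level with the bookkeeping fields.
stored = wrap({"title": "hello", "_id": "app-level id"}, doc_id="Y", rev="X")
user_doc = get_userdata(stored)
```

The point of the sketch is that the user document comes back untouched, with no reserved names to avoid.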



This would be a more complex design than the current use of the underscore at the top level of documents and would definitely encourage a quite different implementation. I don't know the internals enough yet to comment on this. The code there to date is remarkably terse for what it does but this may just reflect the use of Erlang.

I just have trouble seeing this POV - it seems to me that having a reserved "namespace" (the _.*) at a specific level (the top level) of the user document to hold the metadata makes things more complex. Not only is it an exception to what a person can store in couch, but it itself contains an exception - it only applies to the top level. Consumers and producers have to be aware that the documents are coming from Couch (consumers have to know that _id and _rev are metadata and should be ignored for application purposes, but only if in the top level...) and producers have to avoid using _id and _rev for application data....
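To make the "exception within an exception" concrete, here is a sketch of what a consumer has to do under the top-level underscore scheme. The helper name strip_top_level_metadata is hypothetical, and the rule it encodes (drop every top-level key starting with '_') is an assumption for illustration.

```python
# Hypothetical helper: what a consumer needs under the underscore scheme.

def strip_top_level_metadata(doc):
    """Drop top-level keys starting with '_'; nested keys are left alone."""
    return {k: v for k, v in doc.items() if not k.startswith("_")}

doc = {
    "_id": "Y",                       # bookkeeping, must be ignored
    "_rev": "X",                      # bookkeeping, must be ignored
    "title": "hello",
    "nested": {"_id": "user data"},   # same name, but application data here
}
app_data = strip_top_level_metadata(doc)
```

Every consumer needs some version of this special case, and every producer has to remember that '_id' is fine inside "nested" but not at the top.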

Then any "couch aware" code I write can safely know that anything that's couch specific is in doc.metadata and anything that's the stored user data is in doc.userdata, and never the beams shall be crossed. Any apps I write (say AJAX stuff) don't need to special-case the handling of the responses, since anything in a user doc is user data, and I should be able to make requests that just return that userdata.

geir
