On Dec 31, 2008, at 7:37 AM, Robert Dionne wrote:
On Dec 30, 2008, at 8:33 PM, Antony Blakey wrote:
1. The current scheme of prepending _ to atom names when the atom
is used inside a document. Con is the breakage of name identity,
which has technical consequences as well as cognitive ones. Does
the rule only apply at the top level of a document? What about
future injected metadata that has internal structure?
2. Use '_' for all atoms, inside and outside documents. Con is the
noise of extra underscores everywhere.
3. Don't use underscores inside documents - for id and rev at
least, this wouldn't seem to be a big issue, but isn't future-proof
if you want to handle other injected fields.
4. Use '_' for atoms that have to be injected, and make the name BE
the '_' form. Con is that you have to decide in advance if an atom
is going to ever be injected.
5. Use a '_meta' wrapper for the metadata. I don't see any
technical cons, and IMO is by far the cleanest model. Name identity
is preserved, it's arbitrarily extensible without scalability
concerns, and is structural rather than lexical.
It is clearly cleaner and has its advantages; however, I have to
agree with an earlier poster: "Putting them in a _meta group might
encourage aggregation and manipulation of the bookkeeping metadata
separately from the document, which to me sounds like a recipe for
trouble."
What trouble? I think this is *exactly* what should be done - have
CouchDB store documents of the form:
{
  metadata : { _rev : X, _id : Y, _woogie : Z, .... anything that
               needs to be added in the future, such as a last
               update date .... },
  userdata : { .... the document you want to store .... }
}
and then offer APIs that let you:
a) get to this whole document, for libraries and clients that know
they are talking to Couch and want to manipulate it at this level
b) return and accept the user document directly, for clients that just
want to consume or produce JSON data, w/o caring about the internal
housekeeping
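
A minimal sketch of the two access levels in (a) and (b), assuming the
metadata/userdata envelope shown above. The names wrap and unwrap are
made up for illustration and are not part of any CouchDB API:

```python
# Hypothetical helpers illustrating the proposed envelope; not a CouchDB API.

def wrap(userdata, doc_id, rev, **extra):
    """Level (a): build the full envelope that a Couch-aware client sees."""
    metadata = {"_id": doc_id, "_rev": rev}
    metadata.update(extra)  # room for future injected fields
    return {"metadata": metadata, "userdata": userdata}


def unwrap(envelope):
    """Level (b): hand back only the user document, untouched."""
    return envelope["userdata"]


# The user document may use any key it likes, even "_id", because the
# bookkeeping fields live in a separate branch of the envelope.
user_doc = {"title": "notes", "_id": "my own id field"}
envelope = wrap(user_doc, doc_id="Y", rev="X")

print(envelope["metadata"])  # bookkeeping only
print(unwrap(envelope))      # user data only
```

The point of the sketch is only that the separation is structural: no
key in the user document ever needs to be renamed or reserved.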
This would be a more complex design than the current use of the
underscore at the top level of documents and would definitely
encourage a quite different implementation. I don't know the
internals enough yet to comment on this. The code there to date is
remarkably terse for what it does but this may just reflect the use
of Erlang.
I just have trouble seeing this POV - it seems to me that having a
reserved "namespace" (the _.* names) at a specific level (the top
level) of the user document to hold the metadata makes things more
complex. Not only is it an exception to what a person can store in
Couch, but it itself contains an exception - it only applies to the
top level.
Consumers and producers have to be aware that the documents are coming
from Couch: consumers have to know that _id and _rev are metadata
and should be ignored for application purposes (but only at the top
level...), and producers have to avoid using _id and _rev for
application data....
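
To make that special case concrete, here is a rough sketch of what a
consumer has to do under the current top-level underscore scheme. The
function name is hypothetical, and the rule it encodes (drop top-level
keys starting with "_", leave nested ones alone) is my reading of the
convention being discussed, not code taken from CouchDB:

```python
# Hypothetical consumer-side helper; illustrates the top-level-only rule.

def strip_top_level_metadata(doc):
    """Drop keys starting with '_' at the top level only."""
    return {k: v for k, v in doc.items() if not k.startswith("_")}


doc = {
    "_id": "Y",                      # injected metadata
    "_rev": "X",                     # injected metadata
    "title": "notes",                # application data
    "nested": {"_id": "app data"},   # application data, same key name
}

app_view = strip_top_level_metadata(doc)
print(app_view)
```

Every consumer needs some equivalent of this filter, and it has to
know that the same key name means different things at different depths.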
With the metadata/userdata split, any "couch aware" code I write can
safely know that anything that's couch specific is in doc.metadata and
anything that's the stored user data is in doc.userdata, and never the
beams shall be crossed. Any apps I write (say AJAX stuff) don't need to
special-case the handling of the responses, since anything in a user
doc is user data, and I should be able to make requests that just
return that userdata.
geir