On 12/12/2008, at 6:22 AM, Damien Katz wrote:


On Dec 11, 2008, at 2:39 PM, Chris Anderson wrote:

On Wed, Dec 3, 2008 at 3:59 PM, Antony Blakey <[email protected] > wrote:

On 04/12/2008, at 9:55 AM, Chris Anderson wrote:

On Wed, Dec 3, 2008 at 6:09 AM, Adam Kocoloski <[email protected] >
wrote:

2) The "/" in the _design doc ID is confusing.

Oh someone, please make it easy! (and correct)

Someone please make it absolutely, 100%, correct.


The more I program against Couch, especially in a browser, the more I
run into issues where different parts of the toolchain tend toward
auto-unescaping %2F. It's hard to be certain that I've got something
absolutely, 100% correct, but we'll never get there if we don't start.

Here are some examples which assume that docid's slashes will be
urlencoded (unless the docid starts with '_'). This is the current
rule (roughly). Each example has 2 urls with attachments that have no
slashes in the name, followed by a url with an attachment with
multiple slashes. I think it is feasible to allow this sort of thing
to happen, by putting a little bit of special-case logic in the
routing code. I don't think doing so breaks anything fundamental about
CouchDB.

regular docs:

/db/docid
/db/docid/afile
/db/docid/afile/with/nested/slashes

design docs:

/db/_design/name
/db/_design/name/afile
/db/_design/name/afile/with/nested/slashes


If your docid does not start with '_' (i.e. it is not a local or design
doc) then any slashes in the docid would have to be escaped. This is so we
can know where attachment addressing begins. Also, design docs with
slashes after the initial one (slashes in the name) would have to
escape them.

regular doc with slashes in id:

/db/docid%2Fwith%2Fslashes
/db/docid%2Fwith%2Fslashes/afile
/db/docid%2Fwith%2Fslashes/afile/with/nested/slashes

design doc with slashes in name:

/db/_design/name%2Fwith%2Fslashes
/db/_design/name%2Fwith%2Fslashes/afile
/db/_design/name%2Fwith%2Fslashes/afile/with/nested/slashes
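To make the rule above concrete, here is a rough sketch in Python (not CouchDB's actual routing code, which is Erlang). The function name split_doc_path and the PREFIXES set are made up for illustration, and treating _local the same way as _design is an assumption based on the "local or design doc" remark. The key point is to split on raw slashes first and percent-decode each segment afterwards, so that %2F stays inside the docid.

```python
from urllib.parse import unquote

# Assumption: these reserved prefixes make the docid span two path segments.
PREFIXES = {"_design", "_local"}


def split_doc_path(path):
    """Split '/db/...' into (db, docid, attachment), attachment may be None."""
    # Split on literal slashes BEFORE decoding, so %2F is not a separator.
    segments = path.lstrip("/").split("/")
    if len(segments) < 2 or not segments[0] or not segments[1]:
        raise ValueError("expected /db/docid[/attachment]")
    db, rest = unquote(segments[0]), segments[1:]
    if rest[0] in PREFIXES:
        if len(rest) < 2 or not rest[1]:
            raise ValueError("expected a name after " + rest[0])
        # '_design' plus exactly one more segment forms the docid.
        docid = rest[0] + "/" + unquote(rest[1])
        tail = rest[2:]
    else:
        # Regular doc: exactly one segment, slashes must arrive as %2F.
        docid = unquote(rest[0])
        tail = rest[1:]
    # Whatever is left is the attachment name, nested slashes included.
    attachment = "/".join(unquote(s) for s in tail) if tail else None
    return db, docid, attachment
```

With this sketch, /db/docid%2Fwith%2Fslashes/afile/with/nested/slashes yields the docid "docid/with/slashes" and the attachment "afile/with/nested/slashes". It only works if nothing upstream (browser, proxy, HTTP library) has already unescaped %2F before the router sees the path.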

Special names, special paths, sometimes encoding, sometimes not. Such magic
is evil because it always comes back to bite your arse.


I think I may have this correct - i.e. non-arse-biting. But I'm posting
to the dev list because y'all might see what I don't.

I plan to put this into trunk before 1.0 (I think it will be backwards
compatible). Comments?

Chris

--
Chris Anderson
http://jchris.mfdz.com


I agree with everything but slashes in design doc names.

So the guidance is that users must not use document names starting with '_' if they want to avoid astonishment?

The other alternative is to always require the component after the db to be 'special', i.e. document URLs could be

  /db/_/docid%2Fwith%2Fslashes/afile/with/nested/slashes

No special rules required. IMO this example makes clear the cause of the issue.
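For comparison, a small Python sketch of how parsing might look under this alternative. The function name split_marked_path is hypothetical, and the sketch assumes that every docid, including a design doc's, is sent as a single escaped segment after the '_' marker. The docid is then always the third segment and no prefix check is needed.

```python
from urllib.parse import unquote


def split_marked_path(path):
    """Split '/db/_/docid[/attachment...]' into (db, docid, attachment)."""
    # Split on literal slashes first; %2F inside the docid is decoded later.
    segments = path.lstrip("/").split("/")
    if len(segments) < 3 or segments[1] != "_" or not segments[2]:
        raise ValueError("expected /db/_/docid[/attachment]")
    db = unquote(segments[0])
    docid = unquote(segments[2])
    # Everything after the docid segment is the attachment name.
    tail = segments[3:]
    attachment = "/".join(unquote(s) for s in tail) if tail else None
    return db, docid, attachment
```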

I think we probably shouldn't support design docs with slashes, and maybe all other weird characters.

I think all document names should be Unicode.

For one thing, we use the design doc name as the file name for the view index file for the views. This is an issue that can prove problematic on certain platforms and not others.

The file name can be escaped. There are also limitations on the length of the filename depending on the platform. I suggest using an escaped form of some initial segment of the name, concatenated with an escaped form of some final segment of the name, concatenated with a hash of the full name.

If the name is less than a certain length, then just escape the full name.
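As a rough illustration of that mapping, a Python sketch follows. The name index_filename, the limits MAX_LEN and EDGE, the '-' separator, and the choice of URL escaping plus SHA-1 are all assumptions made for the example, not anything CouchDB does today.

```python
import hashlib
from urllib.parse import quote

MAX_LEN = 64  # hypothetical limit on the escaped name before truncating
EDGE = 16     # hypothetical number of characters kept from each end


def index_filename(name, max_len=MAX_LEN, edge=EDGE):
    """Map a design doc name to a filesystem-safe file name."""
    # safe="" makes quote() escape '/' as well, so no path separators survive.
    escaped = quote(name, safe="")
    if len(escaped) <= max_len:
        # Short names: just escape the full name, still readable by an admin.
        return escaped
    # Long names: escaped head + escaped tail + hash of the full name.
    # The hash keeps distinct names from colliding on the same file.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    head = quote(name[:edge], safe="")
    tail = quote(name[-edge:], safe="")
    return head + "-" + tail + "-" + digest
```

Short names stay recognisable from the command line, and long ones keep a readable head and tail next to the hash. Names such as "." or ".." pass through quote() unchanged and would need separate handling.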

Also, provide a handler that returns a JSON document associating filenames with the original names. This exposes the mapping implementation in a way that can be used by developers. Maybe also a handler to map from an arbitrary string to a filename, using Couch's mapping function. Useful for plugin/_external authors who want to use local files.

IMO, limiting the names of things because of filesystem limitations is a bad example of abstraction leakage.

If the design doc has weird characters that aren't supported in the file system, we can't make the index file. If we hash the filename, then it's impossible for an admin to figure out which files are which from the command line. So maybe we should url escape the name for the file system too. Or just not support weird characters at all.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

If at first you don’t succeed, try, try again. Then quit. No use being a damn fool about it
  -- W.C. Fields
