I've been pondering this issue of the weird _design/ doc hack. I'd either agree with Zach on having separately named keys for open or closed bounds on *both* ends, or, specific to the string and array types, a startswith parameter. I don't much like the startswith idea, though, as it's not generally applicable.
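
To make the first option concrete, here is a rough sketch in Python of what open or closed bounds on both ends could look like over a sorted list of keys. The function and the parameter names (start_inclusive, end_inclusive) are made up for illustration and are not an existing or proposed CouchDB API; the keys are assumed to be already sorted with Python's default ordering.

```python
from bisect import bisect_left, bisect_right


def select_range(sorted_keys, start=None, end=None,
                 start_inclusive=True, end_inclusive=True):
    """Return the slice of sorted_keys that lies between start and end.

    Hypothetical helper: each bound can be closed (inclusive) or open
    (exclusive) independently of the other.
    """
    lo, hi = 0, len(sorted_keys)
    if start is not None:
        # A closed start keeps keys equal to start; an open start skips them.
        find_lo = bisect_left if start_inclusive else bisect_right
        lo = find_lo(sorted_keys, start)
    if end is not None:
        # A closed end keeps keys equal to end; an open end stops before them.
        find_hi = bisect_right if end_inclusive else bisect_left
        hi = find_hi(sorted_keys, end)
    return sorted_keys[lo:hi]


keys = [1, 2, 3, 4, 5]
print(select_range(keys, 2, 4))                         # [2, 3, 4]
print(select_range(keys, 2, 4, end_inclusive=False))    # [2, 3]
print(select_range(keys, 2, 4, start_inclusive=False))  # [3, 4]
```

Unlike a startswith parameter, this works for any key type that can be ordered, which is the general applicability I was after.
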
Also, did I miss what you'd pass in the _design doc scenario as the end key, assuming right-open semantics?

On Thu, Feb 5, 2009 at 4:57 PM, Zachary Zolton <[email protected]> wrote:
> Maximillian,
>
> I'd think both _could_ be useful.
>
> I mean in Ruby we have both for the right-hand boundary of ranges:
> irb(main):005:0> (1..5).max
> => 5
> irb(main):006:0> (1...5).max
> => 4
>
> IMHO, it would be better to use a different pair of parameter names,
> such that we could easily distinguish between open and closed bounds.
>
>
> Cheers,
>
> Zach
>
>
> PS. Is it "Maximillian" or "Max"? :^D
>
> On Thu, Feb 5, 2009 at 3:32 PM, Maximillian Dornseif (JIRA)
> <[email protected]> wrote:
>>
>> [
>> https://issues.apache.org/jira/browse/COUCHDB-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670911#action_12670911
>> ]
>>
>> Maximillian Dornseif commented on COUCHDB-194:
>> ----------------------------------------------
>>
>> So far nobody seems against it.
>>
>> The downside is that it MIGHT break some existing code.
>>
>>> [startkey, endkey[: provide a right-open range selection method
>>> ---------------------------------------------------------------
>>>
>>> Key: COUCHDB-194
>>> URL: https://issues.apache.org/jira/browse/COUCHDB-194
>>> Project: CouchDB
>>> Issue Type: Improvement
>>> Components: HTTP Interface
>>> Affects Versions: 0.9
>>> Reporter: Maximillian Dornseif
>>> Priority: Blocker
>>> Fix For: 1.0
>>>
>>>
>>> While writing something about using CouchDB I came across the issue of
>>> "slice indexes" (called startkey and endkey in CouchDB lingo).
>>> I found no exact definition of startkey and endkey anywhere in the
>>> documentation. Testing reveals that on _all_docs and on views,
>>> documents are returned in the interval
>>> [startkey, endkey] = (startkey <= k <= endkey).
>>> I don't know if this was a conscious design decision.
>>> But I'd like to promote a slightly different interpretation (and thus
>>> API change):
>>> [startkey, endkey[ = (startkey <= k < endkey).
>>> Both approaches are valid and used in the real world. Ruby uses the
>>> inclusive ("right-closed" in math speak) first approach:
>>> >> l = [1,2,3,4]
>>> >> l.slice(1..2)
>>> => [2, 3]
>>> Python uses the exclusive ("right-open" in math speak) second approach:
>>> >>> l = [1,2,3,4]
>>> >>> l[1:2]
>>> [2]
>>> For array indices both work fine and which one to prefer is mostly an issue
>>> of habit. In spoken language both approaches are used: "Have the software
>>> done by Saturday" probably means right-open to the client and
>>> right-closed to the coder.
>>> But if you are working with keys that are more than array indexes, then
>>> right-open is much easier to handle. That is because with right-closed
>>> intervals you have to *guess* the biggest value you want to get. The Wiki at
>>> http://wiki.apache.org/couchdb/View_collation contains an example of that
>>> problem:
>>> It is suggested that you use
>>> startkey="_design/"&endkey="_design/ZZZZZZZZZ"
>>> or
>>> startkey="_design/"&endkey="_design/\u9999"
>>> to get a list of all design documents - also the replication system in the
>>> db core uses the same hack.
>>> This breaks if a design document is named "ZZZZZZZZZTop" or
>>> "\u9999Iñtërnâtiônàlizætiøn". Such names might be unlikely but we are
>>> computer scientists; "unlikely" is a bad approach to software engineering.
>>> The thing we really want to ask CouchDB is to "get all documents with
>>> keys starting with '_design/'".
>>> This is basically impossible to do with right-closed intervals. We could
>>> use startkey="_design/"&endkey="_design0" ('0' is the ASCII character after
>>> '/') and this will work fine ... until there is actually a document with
>>> the key "_design0" in the system. Unlikely, but ...
>>> To make selection by intervals reliable, clients currently have to guess the
>>> last key (the ZZZZ approach) or use the first key not to include (the
>>> _design0 approach) and then post-process the result to remove the last
>>> element returned if it exactly matches the given endkey value.
>>> If CouchDB changed to a right-open interval approach, post-processing
>>> would go away in most cases. See
>>> http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/
>>> for two real world examples.
>>> At least for string keys and float keys changing the meaning to [startkey,
>>> endkey[ would allow selections like
>>> * "all strings starting with 'abc'"
>>> * all numbers between 10.5 and 11
>>> It also would hopefully not break too much existing code. Since the notion
>>> of endkey seems to be already considered "fishy" (see the ZZZZZ approach)
>>> most code seems to try to avoid that issue. For example
>>> 'startkey="_design/"&endkey="_design/ZZZZZZZZZ"' still would work unless
>>> you have a design document being named exactly "ZZZZZZZZZ".
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>>
>
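
For the "keys starting with '_design/'" case discussed in the issue, here is a minimal sketch in Python of how a right-open end key could be derived from a prefix. The helper names are hypothetical, and the sketch assumes keys are ordered by plain Unicode code point, as Python strings are; CouchDB's own collation rules (see the View_collation wiki page) may order some keys differently, so treat this as an illustration of the idea rather than of CouchDB's behaviour.

```python
from bisect import bisect_left


def prefix_end_key(prefix):
    """Return the smallest string sorting after every string with this prefix.

    Assumes plain code point ordering (an assumption, not CouchDB collation).
    """
    if not prefix:
        raise ValueError("an empty prefix matches every key")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        raise ValueError("last character has no successor")
    # Bump the final character: "_design/" becomes "_design0".
    return prefix[:-1] + chr(last + 1)


def keys_with_prefix(sorted_keys, prefix):
    """Right-open selection [prefix, prefix_end_key(prefix)[ on sorted keys."""
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_left(sorted_keys, prefix_end_key(prefix))
    return sorted_keys[lo:hi]


keys = sorted([
    "_design/", "_design/a", "_design/ZZZZZZZZZTop",
    "_design/\u9999x", "_design0", "doc1",
])
print(prefix_end_key("_design/"))           # _design0
print(keys_with_prefix(keys, "_design/"))   # the four "_design/..." keys
```

With a right-open end key, a document that really is named "_design0" is left out without any post-processing, and names like "ZZZZZZZZZTop" are still included.
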
