[
https://issues.apache.org/jira/browse/COUCHDB-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743063#action_12743063
]
Chris Anderson commented on COUCHDB-194:
----------------------------------------
Eric,
Patches are totally welcome on this. _by_seq probably hasn't had much attention
lately, since I think it's deprecated in favor of _changes.
It's worth noting that docids are not collated with ICU when they are in the
_all_docs view, so there are some places the collation rules can differ.
If you are able to prepare a test case illustrating the change you'd like, it
probably won't be hard to find someone to finish the patch.
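
For a test case along those lines, a minimal sketch of the two interpretations
might look like the following. It models the index as a sorted in-memory Python
list compared by code point, which is only an assumption about ordering and is
not CouchDB's ICU view collation. select_range and the sample document IDs are
hypothetical names used for illustration.

```python
from bisect import bisect_left, bisect_right


def select_range(sorted_keys, startkey, endkey, inclusive_end=True):
    """Return the keys between startkey and endkey from a sorted list.

    inclusive_end=True  models [startkey, endkey]  (startkey <= k <= endkey)
    inclusive_end=False models [startkey, endkey[  (startkey <= k <  endkey)
    """
    lo = bisect_left(sorted_keys, startkey)
    if inclusive_end:
        hi = bisect_right(sorted_keys, endkey)
    else:
        hi = bisect_left(sorted_keys, endkey)
    return sorted_keys[lo:hi]


# Hypothetical document IDs, sorted by Python's code-point ordering.
doc_ids = sorted(["_design/app", "_design/ZZZZZZZZZTop", "_design0", "a_doc"])

# Right-closed: the document named exactly "_design0" leaks into the result.
closed = select_range(doc_ids, "_design/", "_design0", inclusive_end=True)

# Right-open: only keys that start with "_design/" are returned.
half_open = select_range(doc_ids, "_design/", "_design0", inclusive_end=False)
```

With a right-open endkey the query returns only the design documents, even
when a document named "_design0" exists.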
> [startkey, endkey[: provide a right-open range selection method
> ---------------------------------------------------------------
>
> Key: COUCHDB-194
> URL: https://issues.apache.org/jira/browse/COUCHDB-194
> Project: CouchDB
> Issue Type: Improvement
> Components: HTTP Interface
> Affects Versions: 0.10
> Reporter: Maximillian Dornseif
> Priority: Blocker
> Fix For: 0.10
>
>
> While writing something about using CouchDB I came across the issue of "slice
> indexes" (called startkey and endkey in CouchDB lingo).
> I found no exact definition of startkey and endkey anywhere in the
> documentation. Testing reveals that, for both _all_docs and views,
> documents are returned in the closed interval
> [startkey, endkey] = (startkey <= k <= endkey).
> I don't know if this was a conscious design decision. But I would like to promote a
> slightly different interpretation (and thus API change):
> [startkey, endkey[ = (startkey <= k < endkey).
> Both approaches are valid and used in the real world. Ruby uses the inclusive
> ("right-closed" in math speak) first approach:
> >> l = [1,2,3,4]
> >> l.slice(1..2)
> => [2, 3]
> Python uses the exclusive ("right-open" in math speak) second approach:
> >>> l = [1,2,3,4]
> >>> l[1:2]
> [2]
> For array indices both work fine and which one to prefer is mostly an issue
> of habit. In spoken language both approaches are used: "Have the software
> done by Saturday" probably means right-open to the client and right-closed
> to the coder.
> But if you are working with keys that are more than array indexes, then
> right-open is much easier to handle. That is because with a right-closed
> interval you have to *guess* the biggest value you want to get. The Wiki at
> http://wiki.apache.org/couchdb/View_collation contains an example of that
> problem:
> It is suggested that you use
> startkey="_design/"&endkey="_design/ZZZZZZZZZ"
> or
> startkey="_design/"&endkey="_design/\u9999"
> to get a list of all design documents - also the replication system in the db
> core uses the same hack.
> This breaks if a design document is named "ZZZZZZZZZTop" or
> "\u9999Iñtërnâtiônàlizætiøn". Such names might be unlikely, but we are computer
> scientists; "unlikely" is a bad approach to software engineering.
> What we really want to ask CouchDB is to "get all documents with
> keys starting with '_design/'".
> This is basically impossible to do with right-closed intervals. We could use
> startkey="_design/"&endkey="_design0" ('0' is the ASCII character after '/')
> and this will work fine ... until there is actually a document with the key
> "_design0" in the system. Unlikely, but ...
> To make selection by intervals reliable, clients currently have to either
> guess the last key (the ZZZZ approach) or use the first key not to include
> (the _design0 approach) and then post-process the result to remove the last
> element returned if it exactly matches the given endkey value.
> If CouchDB changed to a right-open interval approach, post-processing
> would go away in most cases. See
> http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/
> for two real world examples.
> At least for string keys and float keys changing the meaning to [startkey,
> endkey[ would allow selections like
> * "all strings starting with 'abc'"
> * all numbers between 10.5 and 11
> It also would hopefully not break too much existing code. Since the notion of
> endkey seems to be already considered "fishy" (see the ZZZZZ approach), most
> code seems to try to avoid that issue. For example,
> 'startkey="_design/"&endkey="_design/ZZZZZZZZZ"' would still work unless you
> have a design document named exactly "ZZZZZZZZZ".
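
The endkey computation and the client-side post-processing described in the
issue can be sketched as follows. This is only an illustration under stated
assumptions: prefix_upper_bound and drop_exact_endkey are hypothetical helper
names, not part of CouchDB or any client library, and the sketch assumes plain
code-point ordering rather than CouchDB's view collation.

```python
def prefix_upper_bound(prefix):
    """Return an exclusive upper bound for all strings starting with prefix.

    The last character is bumped to the next code point, so "_design/"
    becomes "_design0" ('0' follows '/' in ASCII).
    """
    if not prefix:
        raise ValueError("an empty prefix has no upper bound")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        raise ValueError("last character is already the highest code point")
    return prefix[:-1] + chr(last + 1)


def drop_exact_endkey(keys, endkey):
    """Workaround for right-closed results: drop a trailing exact endkey match."""
    if keys and keys[-1] == endkey:
        return keys[:-1]
    return keys


endkey = prefix_upper_bound("_design/")

# Hypothetical keys as a right-closed query might return them.
returned = ["_design/app", "_design0"]
design_docs = drop_exact_endkey(returned, endkey)
```

With a right-open endkey the call to drop_exact_endkey would be unnecessary,
because the bound computed by prefix_upper_bound would never be part of the
result.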