[
https://issues.apache.org/jira/browse/COUCHDB-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103758#comment-15103758
]
Robert Kowalski commented on COUCHDB-2748:
------------------------------------------
Reopening... I recently worked with the WHATWG on the console spec and was able
to learn more about this topic.
Additionally, this issue comes up for users of the dashboard and we are not able
to solve it, because browsers behave according to the HTML standard. This issue
applies to all users of the HTTP API that use CouchDB for their website.
RFC 2141 says:
{code}
2.3.1 The "%" character
The "%" character is reserved in the URN syntax for introducing the
escape sequence for an octet. Literal use of the "%" character in a
namespace must be encoded using "%25" in URNs for that namespace.
The presence of an "%" character in an URN MUST be followed by two
characters from the <hex> character set.
{code}
*The presence of an "%" character in an URN MUST be followed by two characters
from the <hex> character set.*
Source: https://tools.ietf.org/html/rfc2141#section-2.2 (see section 2.3.1,
"The "%" character").
Note: please also see the table at the top with the possible forms of a URL.
I think [~paul.joseph.davis] mentioned that issue a bit above, too.
The URLs I provided as examples do not satisfy this requirement.
This is the validator output for the URLs from the example:
https://validator.w3.org/nu/?doc=http%3A%2F%2Fkowalski.gd%2Fassets%2Fdata%2Fcouchdb-invalid-urls.html
According to the specs, the URLs provided by CouchDB have syntax errors.
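To make the rule quoted above concrete, here is a minimal Python sketch of that check. The function name {{has_malformed_percent}} is hypothetical and only for illustration; it is not part of CouchDB, Fauxton, or the validator, and it tests nothing except the "two hex characters after %" requirement.

```python
import re

# A "%" is well-formed only when it is followed by exactly two hex digits.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_malformed_percent(path: str) -> bool:
    """Return True if any "%" in path is not followed by two hex digits."""
    return _BAD_PERCENT.search(path) is not None


# The URL used to create the document is well-formed.
print(has_malformed_percent("/testrainyday/BANANA%253A21%25"))  # False

# A URL built verbatim from the returned id ends in a bare "%".
print(has_malformed_percent("/testrainyday/BANANA%3A21%"))  # True
```

A URL built by pasting the returned id {{BANANA%3A21%}} directly into the path fails this check because of the trailing {{%}}.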
The RFC I cited is also in line with the WHATWG URL spec:
*Percent-encoded bytes can be used to encode code points that are not URL code
points or are excluded from a syntax production.*
see: https://url.spec.whatwg.org/#url-syntax
see also the section *Percent encoded bytes* after reading the url-syntax
section: https://url.spec.whatwg.org/#percent-encoded-byte
While I agree that this is not a technical error in hex encoding as a general
process, CouchDB's behaviour does not follow the specs or match how current
browsers (and other platforms) work.
With the current implementation in CouchDB it is not possible for Fauxton or
other clients to encode/decode the URLs properly, as the encoding libraries
work according to the standard. I would even say it is not possible to write a
non-standard URL parser/encoder, because a stateless client cannot encode the
URL properly. That is only possible if the initial input to the database is
known.
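As a rough illustration of the ambiguity, the following sketch uses only Python's standard {{urllib.parse}} module. It shows how a standards-following library treats the ids from this report; it does not reproduce CouchDB's own decoder, and the comparison to CouchDB's behaviour is an assumption based on the curl output in the issue description.

```python
from urllib.parse import quote, unquote

stored_id = "BANANA%3A21%"  # the _id as returned by CouchDB in the report

# Encoding the raw id for use in a URL path escapes every literal "%".
encoded = quote(stored_id, safe="")
print(encoded)  # BANANA%253A21%25

# Decoding the properly encoded form gives the stored id back.
print(unquote(encoded) == stored_id)  # True

# Pasting the id into a URL verbatim is different: "%3A" is decoded to ":",
# and Python's lenient unquote leaves the trailing bare "%" untouched.
print(unquote(stored_id))  # BANANA:21%

# Two different path segments decode to the same string, so a client that
# only sees the decoded result cannot tell which one was sent.
print(unquote("BANANA%25"), unquote("BANANA%"))  # BANANA% BANANA%
```

If the server decodes in a similarly lenient way, this would be consistent with both {{BANANA%25}} and {{BANANA%}} resolving to the same document while {{BANANA%3A21%}} is not found.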
> encoding problems with reserved chars
> -------------------------------------
>
> Key: COUCHDB-2748
> URL: https://issues.apache.org/jira/browse/COUCHDB-2748
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Reporter: Robert Kowalski
>
> Let's create a database!
> {noformat}
> curl -X PUT http://localhost:5984/testrainyday
> {noformat}
> I get: {{{"ok":true}}}
> Let's create a document called BANANA%253A21%25
> {noformat}
> curl -X PUT http://localhost:5984/testrainyday/BANANA%253A21%25 -d '{}'
> {noformat}
> CouchDB returns:
> {noformat}
> {"_id":"BANANA%3A21%","_rev":"1-967a00dff5e02add41819138abb3284d"}
> {noformat}
> (note the changed id - it is missing the 25)
> Let's use the id from the response to retrieve the doc:
> {noformat}
> curl http://localhost:5984/testrainyday/BANANA%3A21%
> {noformat}
> I get:
> {noformat}
> {"error":"not_found","reason":"missing"}
> {noformat}
> :(
> New try:
> curl http://localhost:5984/testrainyday/_all_docs
> returns:
> {noformat}
> {"total_rows":1,"offset":0,"rows":[
> {"id":"BANANA%3A21%","key":"BANANA%3A21%","value":{"rev":"1-967a00dff5e02add41819138abb3284d"}}
> ]}
> {noformat}
> I get BANANA%3A21% as the id again, but when I want to curl it or use it in
> my JS application, I get `{"error":"not_found","reason":"missing"}`
> I noticed that it works for these two ids:
> curl -X PUT http://localhost:5984/testrainyday/BANANA%25 -d '{}'
> {noformat}
> {"ok":true,"id":"BANANA%","rev":"1-967a00dff5e02add41819138abb3284d"}
> {noformat}
> In this last case it works magically for both ids:
> {noformat}
> (17:54:11) [robert@tequila-work] ~ $ curl -X PUT http://localhost:5984/testrainyday/BANANA%25 -d '{}'
> {"ok":true,"id":"BANANA%","rev":"1-967a00dff5e02add41819138abb3284d"}
> (17:55:45) [robert@tequila-work] ~ $ curl http://localhost:5984/testrainyday/BANANA%25
> {"_id":"BANANA%","_rev":"1-967a00dff5e02add41819138abb3284d"}
> (17:55:57) [robert@tequila-work] ~ $ curl http://localhost:5984/testrainyday/BANANA%
> {"_id":"BANANA%","_rev":"1-967a00dff5e02add41819138abb3284d"}
> {noformat}