Dirkjan Ochtman created COUCHDB-3003:
----------------------------------------

             Summary: Non-Unicode strings should be URL encoded in headers
                 Key: COUCHDB-3003
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3003
             Project: CouchDB
          Issue Type: Bug
          Components: HTTP Interface
            Reporter: Dirkjan Ochtman


From this CouchDB-Python issue:

https://github.com/djc/couchdb-python/issues/281

And this requests issue:

https://github.com/kennethreitz/requests/issues/3098

There is this comment:

The header parsing is done by httplib, in the Python standard library; that is 
the part that failed to parse. The failure to parse is understandable though: 
servers should not be shoving arbitrary bytes into headers.

Using UTF-8 for your headers is extremely unwise, as discussed by RFC 7230:

    Historically, HTTP has allowed field content with text in the
    ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
    through use of [RFC2047] encoding. In practice, most HTTP header
    field values use only a subset of the US-ASCII charset [USASCII].

In this instance it's not really possible for us to resolve the problem. The 
server should instead be sending urlencoded URLs, or RFC 2047-encoded header 
fields. Either way, httplib is getting confused here, and we can't really step 
in and stop it.
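
To illustrate the two options that comment names, here is a minimal Python sketch of what an ASCII-safe header value could look like. It is not CouchDB's code (CouchDB is written in Erlang); the helper names and the example path "/db/café" are hypothetical and only show the encodings involved.

```python
from email.header import Header, decode_header
from urllib.parse import quote, unquote


def percent_encode_path(path: str) -> str:
    """Percent-encode the UTF-8 bytes of a URL path, keeping "/" separators."""
    return quote(path, safe="/")


def rfc2047_encode(value: str) -> str:
    """Wrap a non-ASCII header value as an RFC 2047 encoded-word."""
    return Header(value, charset="utf-8").encode()


raw_path = "/db/café"  # hypothetical document path with a non-ASCII id
location = percent_encode_path(raw_path)
print(location)  # /db/caf%C3%A9

encoded_word = rfc2047_encode("café")

# Both forms are pure ASCII, so a strict header parser can read them;
# str.encode("ascii") would raise UnicodeEncodeError otherwise.
location.encode("ascii")
encoded_word.encode("ascii")

# A client recovers the original text by reversing the encoding.
assert unquote(location) == raw_path
decoded_bytes, charset = decode_header(encoded_word)[0]
assert decoded_bytes.decode(charset) == "café"
```

For a header that carries a URL, such as Location, percent-encoding is probably the more natural of the two, since clients already have to decode URLs that way.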



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
