Dirkjan Ochtman created COUCHDB-3003:
----------------------------------------
Summary: Non-Unicode strings should be URL encoded in headers
Key: COUCHDB-3003
URL: https://issues.apache.org/jira/browse/COUCHDB-3003
Project: CouchDB
Issue Type: Bug
Components: HTTP Interface
Reporter: Dirkjan Ochtman
From this CouchDB-Python issue:
https://github.com/djc/couchdb-python/issues/281
And this requests issue:
https://github.com/kennethreitz/requests/issues/3098
There is this comment:
The header parsing is done by httplib, in the Python standard library; that is
the part that failed to parse. The failure to parse is understandable though:
servers should not be shoving arbitrary bytes into headers.
Using UTF-8 for your headers is extremely unwise, as discussed by RFC 7230:
Historically, HTTP has allowed field content with text in the ISO-8859-1
charset [ISO-8859-1], supporting other charsets only through use of [RFC2047]
encoding. In practice, most HTTP header field values use only a subset of the
US-ASCII charset [USASCII].
In this instance it's not really possible for us to resolve the problem. The
server should instead be sending urlencoded URLs, or RFC 2047-encoded header
fields. Either way, httplib is getting confused here, and we can't really step
in and stop it.
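
As an illustration of the two options named in that comment, here is a minimal Python sketch of how a non-ASCII value could be made ASCII-safe before it is placed in a header. It is not CouchDB's implementation; the function names and the example URL are hypothetical, and it assumes the raw value is available as a Unicode string that should be serialized as UTF-8.

```python
from email.header import Header, decode_header
from urllib.parse import quote


def percent_encode_header_url(url: str) -> str:
    """Percent-encode non-ASCII characters in a URL destined for a header.

    Hypothetical helper: common URL delimiters and existing '%' escapes
    are left alone; every other character is UTF-8 encoded and escaped.
    """
    return quote(url, safe="/:?&=#%@+,;")


def rfc2047_encode(value: str) -> str:
    """Encode a free-text header value as an RFC 2047 encoded-word.

    Hypothetical helper: the result contains only ASCII characters.
    """
    return Header(value, charset="utf-8").encode()


if __name__ == "__main__":
    # Hypothetical document URL containing a non-ASCII character.
    print(percent_encode_header_url("http://localhost:5984/db/dóc"))

    encoded = rfc2047_encode("dóc")
    print(encoded)
    # A client can recover the original text from the encoded-word.
    raw = b"".join(part for part, _charset in decode_header(encoded))
    print(raw.decode("utf-8"))
```

Percent-encoding looks like the better fit for URL-valued headers such as Location, since clients already know how to decode it. RFC 2047 encoding only helps if the client decodes encoded-words in that particular field.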