Anne van Kesteren wrote:
On Tue, 05 Jan 2010 08:29:53 +0100, Jonas Sicking <[email protected]> wrote:
Wouldn't it then be better to throw for any non-ASCII characters? That
way we don't restrict ourselves for when (if?) IETF defines an encoding
for HTTP headers.
The defined encoding is ISO-8859-1 (unfortunately).
Well, that's debatable, as RFC 2616 wasn't sufficiently precise.
What's a fact is that some HTTP APIs treat them as ISO-8859-1 (servlet
API, for instance).
HTTPbis currently has:
"Historically, HTTP has allowed field content with text in the
ISO-8859-1 [ISO-8859-1] character encoding and supported other character
sets only through use of [RFC2047] encoding. In practice, most HTTP
header field values use only a subset of the US-ASCII character encoding
[USASCII]. Newly defined header fields SHOULD limit their field values
to US-ASCII characters. Recipients SHOULD treat other (obs-text) octets
in field content as opaque data." --
<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-08.html#rfc.section.3.2>
At the very least, throwing if the upper byte is non-zero seems like
the right thing to do to prevent silent data loss.
That works for me.
Sounds good to me as well.
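
To make the check concrete, here is a rough sketch of what "throw if the
upper byte is non-zero" could look like. The function name, the choice of
RangeError, and the maxCodeUnit parameter are my own assumptions for
illustration; none of them come from the XHR draft or from HTTPbis.

```typescript
// Hypothetical helper, not taken from any spec or browser implementation.
// Maps each UTF-16 code unit of a header value to one octet and throws
// instead of silently dropping the upper byte.
function headerValueToOctets(
  value: string,
  maxCodeUnit: number = 0xff // assumption: 0xFF allows ISO-8859-1, 0x7F would allow US-ASCII only
): Uint8Array {
  const octets = new Uint8Array(value.length);
  for (let i = 0; i < value.length; i++) {
    const unit = value.charCodeAt(i);
    if (unit > maxCodeUnit) {
      // Refuse rather than truncate, so the caller notices the data loss.
      throw new RangeError(
        "code unit 0x" + unit.toString(16) + " at index " + i +
        " exceeds 0x" + maxCodeUnit.toString(16)
      );
    }
    octets[i] = unit;
  }
  return octets;
}
```

With the default limit, "caf\u00e9" passes through as four octets, while
"\u20ac" (the euro sign) throws. Passing 0x7f as the second argument would
give the stricter ASCII-only behaviour suggested earlier in the thread.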
Best regards, Julian