Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread And Clover
Manlio Perillo wrote: Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 it

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote: > There is something that I don't understand. > > Some HTTP headers, like Accept-Language, contain data described as > `token`, where: > > token = 1*<any CHAR except CTLs or separators> > > So a token, IMHO, is an opaque string, and it SHOULD not be decoded. >

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover ha scritto: > Manlio Perillo wrote: > >> However what about URI (that is, for PATH_INFO and the like)? >> For URI (if I remember correctly) the suggested encoding is UTF-8, so >> URLS should be decoded using > >> url.decode('utf-8', 'surrogateescape') > >> Is this correct? > > The

Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 08:33:19PM +0100, Manlio Perillo wrote: > Right now I'm doing a: username.decode('us-ascii', 'replace') Or like most frameworks you could let the application author deal with the problem, just pass the raw strings to the application. -- Henry Prêcheur
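
The two options in this exchange can be compared in a small sketch. It assumes Python 3 `bytes` objects (the thread itself was written against Python 2 strings), and the sample `username` value is hypothetical. Decoding with `'replace'` is lossy, while passing the raw bytes through leaves the choice of encoding to the application:

```python
# Hypothetical username as it might arrive on the wire: "café" encoded as UTF-8.
username = b"caf\xc3\xa9"

# Option 1: lossy decode. Every non-ASCII byte is replaced by U+FFFD,
# so the original bytes can no longer be recovered.
lossy = username.decode("us-ascii", "replace")

# Option 2: hand the raw bytes to the application unchanged and let it
# decode them with whatever encoding it knows (or guesses) to be right.
raw = username
app_decoded = raw.decode("utf-8")

print(repr(lossy))
print(repr(app_decoded))
```

The second option keeps the framework out of the guessing game, at the price of every application having to deal with undecoded data.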

Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Henry Precheur ha scritto: > [...] >> How is authorization username handled in common WSGI frameworks? > > As far as I know, they don't handle this. They just return the string > without dealing with the encoding issues. > > I think there is no correct way of handling this, because 99% of > usern

Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 05:09:31PM +0100, Manlio Perillo wrote: > This is really a mess. RFC 2617 doesn't specify any encoding for its headers, so it should be latin-1 everywhere. But on the web nobody respects standards. > How is authorization username handled in common WSGI frameworks? As far a
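
As a rough illustration of why the username encoding is ambiguous, here is a minimal sketch of parsing a Basic `Authorization` header in Python 3 while keeping the credentials as raw bytes. The function names `parse_basic_credentials` and `decode_guess` are hypothetical, and the "try UTF-8, then fall back to latin-1" heuristic is an assumption made for this example, not something proposed in the thread. Malformed base64 input is not handled.

```python
import base64

def parse_basic_credentials(header_value):
    """Return (username, password) as raw bytes, or None if not Basic auth."""
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    # The payload is base64("userid:password"); the encoding of the
    # characters inside is not specified by RFC 2617.
    decoded = base64.b64decode(payload.strip())
    username, sep, password = decoded.partition(b":")
    if not sep:
        return None
    return username, password

def decode_guess(raw):
    """Heuristic (assumption): prefer UTF-8, fall back to latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")

# Two hypothetical clients sending the same username in different encodings.
header_latin1 = "Basic " + base64.b64encode("zoë:secret".encode("latin-1")).decode("ascii")
header_utf8 = "Basic " + base64.b64encode("zoë:secret".encode("utf-8")).decode("ascii")

user_a, _ = parse_basic_credentials(header_latin1)
user_b, _ = parse_basic_credentials(header_utf8)
print(user_a, user_b)
print(decode_guess(user_a) == decode_guess(user_b))
```

The fallback never fails, because latin-1 can decode any byte sequence, but it can still guess wrong for other legacy encodings.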

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Henry Precheur
On Thu, Dec 03, 2009 at 07:35:14PM +0100, And Clover wrote: > >I don't know what the HTTP/Cookie spec says about this. > > The traditional interpretation of RFC2616 is that headers are ISO-8859-1. > > You will notice that no browser correctly follows this. The RFC 2109 & 2965 say that a cookie's

Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread And Clover
Manlio Perillo wrote: I have written a simple WSGI application that asks authentication credentials Ho ho! This is another area that is Completely Broken Everywhere. It's actually a similar situation to the cookies: - Opera and Chrome send non-ASCII cookie characters in UTF-8. - IE encodes

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread James Y Knight
On Dec 3, 2009, at 1:35 PM, And Clover wrote: > Manlio Perillo wrote: > >> However what about URI (that is, for PATH_INFO and the like)? >> For URI (if I remember correctly) the suggested encoding is UTF-8, so >> URLS should be decoded using > >> url.decode('utf-8', 'surrogateescape') > >> Is t

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover ha scritto: > [...] >> Cookie data SHOULD be transparent to the server/gateway; however WSGI is >> going to assume that data is encoded in latin-1. > > Yeah. This is no big deal because non-ASCII characters in cookies are > already broken everywhere(*). Given this and other limitations

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread And Clover
Manlio Perillo wrote: However what about URI (that is, for PATH_INFO and the like)? For URI (if I remember correctly) the suggested encoding is UTF-8, so URLS should be decoded using url.decode('utf-8', 'surrogateescape') Is this correct? The currently-discussed proposal is ISO-8859-1,
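
The two decoding strategies mentioned here can be compared in a short sketch, assuming Python 3.1 or later (where the `surrogateescape` error handler from PEP 383 is available). The sample `raw_path` bytes are hypothetical. Both strategies round-trip arbitrary bytes without loss; they differ in what the intermediate string looks like:

```python
# Hypothetical PATH_INFO bytes: valid UTF-8 for "é" followed by a stray 0xFF.
raw_path = b"/caf\xc3\xa9/\xff"

# Strategy 1: UTF-8 with surrogateescape. Valid sequences decode normally;
# undecodable bytes become lone surrogates (0xFF -> U+DCFF).
as_utf8 = raw_path.decode("utf-8", "surrogateescape")
assert as_utf8.encode("utf-8", "surrogateescape") == raw_path

# Strategy 2: ISO-8859-1. Every byte maps to the code point of the same
# value, so decoding never fails, but UTF-8 text shows up as mojibake.
as_latin1 = raw_path.decode("iso-8859-1")
assert as_latin1.encode("iso-8859-1") == raw_path

# An application that receives the ISO-8859-1 form can recover the bytes
# and re-decode them as UTF-8 itself.
recoded = as_latin1.encode("iso-8859-1").decode("utf-8", "surrogateescape")

print(repr(as_utf8))
print(repr(as_latin1))
```

With ISO-8859-1 the application has to do the re-decoding step explicitly; with `surrogateescape` it gets readable text for well-formed input but must be prepared for lone surrogates in the string.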

Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Manlio Perillo wrote: > Hi. > > I'm doing some tests to try to understand how HTTP headers are encoded > by browsers. > > I have written a simple WSGI application that asks for authentication > credentials, then prints them on the terminal and returns the data as the > response, as raw bytes > http

[Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Hi. I'm doing some tests to try to understand how HTTP headers are encoded by browsers. I have written a simple WSGI application that asks for authentication credentials, then prints them on the terminal and returns the data as the response, as raw bytes: http://paste.pocoo.org/show/154633/ Then I used

Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
James Y Knight ha scritto: > I move to bless mod_wsgi's definition of WSGI 1.1 [1] > [...] > > [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X Hi. Just a few questions. It is true that HTTP headers can be encoded assuming latin-1; and they can be encoded using PEP 383. However wha