Re: [Web-SIG] Python 3.0 and WSGI 1.0.

P.J. Eby Fri, 08 May 2009 14:58:26 -0700

At 10:37 AM 5/8/2009 -0700, Robert Brewer wrote:

It also explicitly states that "HTTP does not directly support Unicode,
and neither does this interface. All encoding/decoding must be handled
by the application; all strings passed to or from the server must be
standard Python BYTE STRINGS (emphasis mine), not Unicode objects. The
result of using a Unicode object where a string object is required, is
undefined."


It also says what the interpretation is when 'str' is a unicode string type.

PEP 333 is difficult to interpret because it uses the name "str"
synonymously with the concept "byte string", which Python 3000 defies. I
believe the intent was to differentiate unicode from bytes, not elevate
whatever type happens to be called "str" on your Python du jour. It was
and is a mistake to standardize on type names ("str") across platforms
and not on type behavior ("byte string").

Ironically, 'str' is what's consistent in type behavior; the bytes type doesn't supply the same operations.

If Python3 WSGI apps emit unicode strings (py3k type 'str'), you're
effectively saying the server will always call
"chunk.encode('latin-1')". That negates any benefit of using unicode as
the type for the response. That's not "supporting unicode"; that's using
unicode exactly as if it were an opaque byte string. That seems silly
to me when there is a perfectly useful byte string type.

Compatibility sometimes demands we do silly things. Personally, I think it's kind of silly that Python 3 files return incompatible data types depending on what mode you open them in, but there's not a whole lot we can do about that.
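For readers who haven't run into the file-mode behavior mentioned above, here is a minimal sketch of it. The helper name read_both_ways is hypothetical, and latin-1 is chosen only because it can decode any byte value; nothing here is part of WSGI itself.

```python
import os
import tempfile


def read_both_ways(data):
    """Write `data` (bytes) to a temp file, then read it back in both modes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        # Binary mode hands back a bytes object.
        with open(path, 'rb') as f:
            as_bytes = f.read()
        # Text mode hands back a str; newline='' disables newline translation
        # so the text stays a faithful decoding of the bytes on disk.
        with open(path, 'r', encoding='latin-1', newline='') as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


raw, text = read_both_ways(b'caf\xe9\n')
print(type(raw).__name__, type(text).__name__)
```

The two results hold the same content but are different types, and in Python 3 they never compare equal without an explicit encode or decode.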

Meanwhile, existing WSGI code ported to Python 3 is going to yield strings until/unless manually converted; AFAIK 2to3 has no way to automatically detect WSGI-ness and convert your strings to bytes.

I don't see any benefit to that.

There isn't any benefit to doing it by *hand*. However, backward compatibility demands that servers *accept* such strings, as they may be generated by legacy apps.

That's why the Python 3 WSGI amendments say servers MUST accept this, even though applications SHOULD supply bytes.

That is, for new code, we do want bytes. What we don't want, ever, is unicode characters above #255 in any unicode strings sent as part of the response body.
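A rough sketch of what that rule could look like on the server side, assuming Python 3. The names to_wire_bytes, legacy_app, new_app and collect_body are hypothetical illustrations, not taken from PEP 333 or from any actual server; the only thing borrowed from this thread is the latin-1 encoding step.

```python
def to_wire_bytes(chunk):
    """Normalize one response-body chunk to bytes (hypothetical helper)."""
    if isinstance(chunk, bytes):
        # New-style apps supply bytes; pass them through untouched.
        return chunk
    if isinstance(chunk, str):
        # latin-1 maps code points 0-255 one-to-one onto byte values.
        # Any character above 255 raises UnicodeEncodeError.
        return chunk.encode('latin-1')
    raise TypeError('body chunk must be bytes or str, got %r' % type(chunk))


def legacy_app(environ, start_response):
    """A ported Python 2 app that still yields native strings."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['Hello, world\n']


def new_app(environ, start_response):
    """New code: yields bytes, as applications SHOULD."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world\n']


def collect_body(app):
    """Toy stand-in for a server loop: call the app and join its chunks."""
    def start_response(status, headers, exc_info=None):
        return None
    return b''.join(to_wire_bytes(chunk) for chunk in app({}, start_response))


print(collect_body(legacy_app) == collect_body(new_app))
```

Both apps produce the same bytes on the wire. A chunk such as '\u20ac' fails loudly in to_wire_bytes instead of being silently mangled, which is one way to enforce the "nothing above #255" restriction.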


