On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
> I don't think you can claim, though, that Python 3 makes things
> significantly harder for these frameworks. The proof is that many of
> them already give the user unicode strings in Python 2.x. They must
> have somehow got the decoding right.

Well... Frameworks usually 'simplify' the problem by partly ignoring it.
By default they assume that the data in the request is UTF-8. You can
specify an alternative encoding in most of them. Django [1], Werkzeug [2],
and WebOb [3] do that.

The problem with this approach is that you still have to deal with weird
requests where one part is UTF-8 and another is latin-1. Sometimes you
can even have two different encodings in a single header, such as Cookie.
There's no general solution to this problem; it has to be solved case by
case.

There was a big discussion a while ago on web-sig. I think the consensus
was that WSGI for Python 3 should assume that the data is encoded in
latin-1, since it's the default encoding according to the RFC.

[1] http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.encoding
[2] http://werkzeug.pocoo.org/documentation/dev/unicode.html#request-and-response-objects
[3] http://pythonpaste.org/webob/reference.html#unicode-variables

-- 
Henry Prêcheur
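
P.S. To illustrate why latin-1 works as a default: it maps every byte
0-255 to the code point with the same number, so decoding with it never
fails and never loses information. An application can always get the
original bytes back and decode them again. Below is a minimal sketch of
that idea, not what any of the frameworks above actually do. The name
`redecode` and its `candidates` parameter are made up for this example.

```python
def redecode(value, candidates=("utf-8",)):
    """Re-decode a string that was decoded from raw bytes as latin-1.

    `value` is assumed to be the latin-1 interpretation of the bytes
    sent by the client. `candidates` lists the encodings to try, in order.
    """
    # Lossless: latin-1 maps code points 0-255 back to bytes 0-255.
    raw = value.encode("latin-1")
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # No candidate matched: keep the latin-1 interpretation.
    return value


# The same word sent by two clients using different encodings.
sent_as_utf8 = "caf\u00e9".encode("utf-8").decode("latin-1")
sent_as_latin1 = "caf\u00e9".encode("latin-1").decode("latin-1")

print(redecode(sent_as_utf8))
print(redecode(sent_as_latin1))
```

Trying UTF-8 first and falling back to latin-1 is only a heuristic. Some
latin-1 byte sequences are also valid UTF-8, so the weird cases still
have to be handled one by one.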