Claude Paroz added the comment:
Thanks for the explanations (and history). I realize that changing the
behaviour is probably not an option.
As an example in a framework, we are currently discussing how we will cope with
this in Django: https://code.djangoproject.com/ticket/19468
On the
And Clover added the comment:
WSGI's usage of ISO-8859-1 for all HTTP-byte-originated strings is very much
deliberate; we needed a way to preserve the original input bytes whilst still
using unicode strings, and at the time surrogateescape was not available. The
result is counter-intuitive
Changes by Terry J. Reedy tjre...@udel.edu:
--
nosy: +aclover, pje
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16679
___
___
Python-bugs-list
Phillip J. Eby added the comment:
Wouldn't it be possible to amend PEP ?
Sure... except then it would also be necessary to amend PEP , and also all
WSGI applications already written that assume this, any time in the last nine
years.
This is a known and intended consistent property
New submission from Claude Paroz:
In wsgiref/simple_server.py (WSGIRequestHandler.get_environ), Python 3 is
currently populating the env['PATH_INFO'] variable by decoding the URL path,
assuming it was encoded with 'iso-8859-1', which appears to be wrong, according
to RFC 3986/3987.
For
Graham Dumpleton added the comment:
The requirement per PEP is that the original byte string needs to be
converted to a native string (Unicode) with the ISO-8859-1 encoding. This is to
ensure that the original bytes are preserved so that the WSGI application, with
its own knowledge of what
Changes by Berker Peksag berker.pek...@gmail.com:
--
versions: +Python 3.4 -Python 3.5
Claude Paroz added the comment:
Attached are my proposed changes.
Also, I just came across http://bugs.python.org/issue3300, which finally led
Python's urllib.parse.quote to default to UTF-8 encoding, after a lengthy
discussion.
--
keywords: +patch
Added file:
Graham Dumpleton added the comment:
You can't try UTF-8 and then fall back to ISO-8859-1. PEP requires it
always be ISO-8859-1. If an application needs it as something else, it is the
web application's job to do it.
The relevant part of the PEP is:
On Python platforms where the str or
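To illustrate the round trip the comments above describe, here is a minimal sketch. It assumes the client percent-encoded the path as UTF-8; the helper name wsgi_path_to_text and the sample path are hypothetical and not taken from wsgiref or from the attached patch. The server side decodes the raw bytes as ISO-8859-1, which loses nothing, and the application re-encodes with the same codec to get the bytes back before decoding them with the encoding it actually expects.

```python
from urllib.parse import unquote_to_bytes


def wsgi_path_to_text(path_info, encoding="utf-8"):
    """Hypothetical helper: recover text from a WSGI 'bytes-as-latin-1' string.

    ISO-8859-1 maps every byte 0-255 to the code point of the same value,
    so encoding with it returns exactly the bytes the server received.
    """
    raw = path_info.encode("iso-8859-1")
    # Assumption: the client used UTF-8. This raises UnicodeDecodeError if not.
    return raw.decode(encoding)


# What a server does, roughly: unquote the request path to bytes, then
# expose those bytes to the application as an ISO-8859-1 decoded str.
raw_path = unquote_to_bytes("/caf%C3%A9")
environ_value = raw_path.decode("iso-8859-1")  # looks garbled, but is lossless

# What the application does to obtain the intended text.
recovered = wsgi_path_to_text(environ_value)
```

Because the ISO-8859-1 step never fails and never drops a byte, the decision about the real encoding (and about what to do when decoding fails) stays with the application rather than with the server.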
Claude Paroz added the comment:
I may understand your reasoning when you cannot make any assumptions about the
encoding of a series of bytes.
I think that the case of PATH_INFO is different, because it should comply with
standards, and then you *can* make the assumption that the original path