Pierre Quentel <[email protected]> added the comment:
@Glenn
"I'm curious what your system (probably Windows since you mention cp-) and
browser, and HTTP server is, that you used for that test. Is it possible to
capture the data stream for that test? Describe how, and at what stage the
data stream was captured, if you can capture it. Most interesting would be on
the interface between browser and HTTP server."
I tested it on Windows XP Family Edition 2002, Service Pack 3, with Python 3.2b2
Browsers : Mozilla Firefox 3.6.13 and Internet Explorer 7.0
Servers : Apache 2.2, and the built-in server started by :
    import http.server
    http.server.test(HandlerClass=http.server.CGIHTTPRequestHandler)
I print the bytes received in the multipart/form-data part by
"print(odelim+line)" at the end of method read_lines_to_outerboundary() of
FieldStorage. The bytes sent when I enter the string
"a"+"n tilde" + the euro sign
are : b'a\xf1\x80' - that is, the cp-1252 encoding of the string
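
For anyone who wants to check the byte values without setting up a browser and a server, here is a minimal sketch. It only uses the standard codecs; the variable names are mine and are not taken from cgi.py.

```python
# The test string: "a", "n tilde" (U+00F1), euro sign (U+20AC)
text = "a\u00f1\u20ac"

# cp1252 maps n tilde to 0xF1 and the euro sign to 0x80
cp1252_bytes = text.encode("cp1252")
print(cp1252_bytes)

# latin-1 has no euro sign, so encoding the same string fails
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)

# Decoding the received bytes as latin-1 does not fail, but the last
# character becomes the control character U+0080, not the euro sign
wrong = cp1252_bytes.decode("latin-1")
print(ascii(wrong))

# For comparison, a browser sending UTF-8 would produce different bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```

So bytes b'a\xf1\x80' are consistent with cp-1252 and cannot come from a latin-1 or utf-8 encoding of this string.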
Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's
not dependent on the configuration - but if others can test on different
configurations I'd like to know the result
Basically, this behaviour is not surprising : if sys.stdin.encoding is set to a
certain value, it's natural that the bytes sent on the binary layer are encoded
with this encoding, not with latin-1
I attach the diff file for an updated version of cgi.py :
- new argument stream_encoding instead of setting an attribute "encoding" to fp
- use locale.getpreferredencoding() to decode the query string
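
To illustrate the second point, here is a rough sketch of decoding a query string with the locale's preferred encoding. This is not the code from the diff: decode_query is a hypothetical helper of mine, and it relies on urllib.parse.parse_qs, which accepts encoding and errors arguments.

```python
import locale
from urllib.parse import parse_qs

def decode_query(qs, encoding=None):
    """Hypothetical helper: parse a query string, decoding
    percent-escaped bytes with the given encoding, or with the
    locale's preferred encoding when none is given."""
    if encoding is None:
        encoding = locale.getpreferredencoding()
    # errors="replace" avoids an exception on bytes that are
    # invalid in the chosen encoding
    return parse_qs(qs, encoding=encoding, errors="replace")

# The encoding is passed explicitly here so that the result does not
# depend on the machine; %F1 and %80 are the cp-1252 bytes seen above
fields = decode_query("name=a%F1%80", encoding="cp1252")
print(fields["name"][0] == "a\u00f1\u20ac")
```

Called without the encoding argument, the result depends on the locale of the machine running the script.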
----------
Added file: http://bugs.python.org/file20356/cgi_diff_20110111.txt
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue4953>
_______________________________________