Pierre Quentel <[email protected]> added the comment:
Many thoughts and tests later...
Glenn, we were both wrong: the encoding to use in FieldStorage is neither
latin-1 nor sys.stdin.encoding. I tested form fields with characters whose
utf-8 encoding has bytes that map to undefined in cp1252, and the calls to
the decode() method with sys.stdin.encoding failed.
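To illustrate the failure described above, here is a minimal sketch. The sample character and the helper name try_decode are my own choices for illustration, not taken from the tests or the patch. It assumes Python's cp1252 codec, in which byte 0x81 is unassigned.

```python
# U+00C1 (capital A with acute accent) encodes in UTF-8 as b'\xc3\x81'.
# Byte 0x81 has no mapping in Python's cp1252 codec.
raw = "\u00c1".encode("utf-8")


def try_decode(data, encoding):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw, "utf-8")      # the original character
as_cp1252 = try_decode(raw, "cp1252")   # None: 0x81 is undefined in cp1252
as_latin1 = try_decode(raw, "latin-1")  # decodes, but to two wrong characters
```

latin-1 never raises because every byte value is mapped, but the result is not the text the user typed, so it is not a safe default either.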
The encoding used by the browser is defined in the Content-Type meta tag, or
in the Content-Type header; if it is defined in neither, the default seems to
vary between browsers. So it's definitely better to define it explicitly.
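As a rough sketch of how a script could read the declared charset from a Content-Type header value, the standard email.message.Message class can parse the parameters. The function name charset_from_content_type and the utf-8 fallback are assumptions for illustration only; they are not part of the patch.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Hypothetical helper: return the charset parameter of a Content-Type
    header value, or `default` when the header does not declare one."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case, or the
    # fallback object when no charset parameter is present.
    return msg.get_content_charset(default)


declared = charset_from_content_type("text/html; charset=ISO-8859-1")
missing = charset_from_content_type("application/x-www-form-urlencoded")
```

Here declared is the charset the page announced, and missing falls back to the default because the header carries no charset parameter.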
The argument stream_encoding used in FieldStorage *must* be this encoding; in
this version, it is set to utf-8 by default.
But this raises another problem when the CGI script has to print the data
received. The built-in print() function encodes the string with
sys.stdout.encoding, and this fails if the string can't be encoded with it.
That is the case on my PC, where sys.stdout.encoding is cp1252: it can't
handle Arabic or Chinese characters.
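The print() problem can be reproduced without touching the real sys.stdout by wrapping an in-memory buffer in a text layer with a chosen encoding. This is only a sketch; the helper name can_print and the sample strings are assumptions of mine.

```python
import io


def can_print(text, encoding):
    """Hypothetical helper: report whether print() can write `text` to a
    text stream that uses `encoding`."""
    # Stand-in for sys.stdout: a text layer over an in-memory binary buffer.
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


chinese_cp1252 = can_print("\u4e2d\u6587", "cp1252")  # Chinese text
arabic_cp1252 = can_print("\u0645\u0631\u062d\u0628\u0627", "cp1252")  # Arabic text
chinese_utf8 = can_print("\u4e2d\u6587", "utf-8")
```

Both cp1252 attempts fail with UnicodeEncodeError inside the helper, while the utf-8 stream accepts the same text.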
The solution I have tried is to pass another argument, charset, to the
FieldStorage constructor, defaulting to utf-8. It must be the same as the
charset declared by the CGI script in its Content-Type header.
FieldStorage uses this argument to override the built-in print() function:
- flush the text layer of sys.stdout, in case calls to print() have been made
  before calling FieldStorage
- get the binary layer of stdout:
      out = sys.stdout.detach()
- define a function _print this way:
      def _print(*strings):
          for item in strings:
              out.write(str(item).encode(charset))
          out.write(b'\r\n')
- override print():
      import builtins
      builtins.print = _print
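The steps above can be tried out safely with a sketch that writes to an in-memory buffer instead of detaching the real sys.stdout and replacing builtins.print. The factory name make_print and the name cgi_print are hypothetical, and the sketch assumes the CRLF is written once per call, after all items; it is not the code of the attached patch.

```python
import io


def make_print(out, charset="utf-8"):
    """Hypothetical factory: build a print-like function that encodes each
    item with `charset` and writes it to the binary stream `out`."""
    def _print(*strings):
        for item in strings:
            out.write(str(item).encode(charset))
        # Assumption: one CRLF per call, written after all items.
        out.write(b"\r\n")
    return _print


# An in-memory buffer stands in for the result of sys.stdout.detach().
buffer = io.BytesIO()
cgi_print = make_print(buffer, charset="utf-8")

cgi_print("Content-Type: text/html; charset=utf-8")
cgi_print()                 # blank line that ends the CGI headers
cgi_print("\u4e2d\u6587")   # Chinese text, encoded with the declared charset
```

In a real CGI script the buffer would be the binary layer of stdout, and the charset passed here has to match the one announced in the Content-Type header.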
The function print() in the CGI script now sends the strings encoded with
"charset" to the binary layer of sys.stdout. All the tests I made with Arabic
or Chinese input fields, or file names, succeed when using this patch; so do
test_cgi and cgi_test (slightly modified).
----------
Added file: http://bugs.python.org/file20382/cgi_diff_20110112.txt
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue4953>
_______________________________________