Hi,

Python 3.0 is released and supports unicode everywhere, great! But as pointed out by several people, bytes are still required on non-Windows OSes for backward compatibility. This email is just a summary of the many related issues and email threads.

Problems with Python 3.0:

(1) Invalid unicode string on the command line
 => some people want to get the command line arguments as bytes, so that
    Python starts even if undecodable strings are present on the command line
 => http://bugs.python.org/issue3023

(2) Non decodable environment variables are skipped in os.environ
 => Create os.environb (or anything else) to get these variables as bytes
    (and be able to set new variables as bytes)
 => Read the email thread "Python-3.0, unicode, and os.environ"
    (December 2008) opened by Toshio Kuratomi

(3) Support bytes for os.exec*() and subprocess.Popen(): process arguments
    and the environment variables
 => http://bugs.python.org/issue4035: my patch for os.exec*()
 => http://bugs.python.org/issue4036: my patch for subprocess.Popen()

Command line
============

I like the current behaviour and I don't want to change it. Feel free to propose a solution to solve the issue ;-)

Environment
===========

I already proposed "os.environb", which would have an API similar to "os.environ" but with bytes. Relations between os.environb and os.environ:

 - for an undecodable variable value in os.environb, os.environ will raise a KeyError. Example with the utf8 charset and os.environb[b'PATH'] = b'\xff': path=os.environ['PATH'] will raise a KeyError, to keep the current behaviour.
 - os.environ raises a UnicodeEncodeError if the key or value can not be encoded in the current charset. Example with the ASCII charset: os.environ['PATH'] = '/home/hayp\xf4'
 - except for undecodable variable values in os.environb, os.environ and os.environb will be consistent. Example: deleting a variable in os.environb will also delete the key in os.environ.

I think that most of these points (or all of them) are ok for everyone (especially for Toshio Kuratomi and me :-)). Now I have to try to write an implementation of this, but it's complex, especially keeping os.environ and os.environb consistent!
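
The relations listed above could be sketched roughly as follows. This is only an illustration of the three rules, not the planned implementation: StrEnvironView is a made-up name, a plain dict stands in for os.environb, and the charset is passed explicitly instead of being read from the locale.

```python
from collections.abc import MutableMapping


class StrEnvironView(MutableMapping):
    """Hypothetical str view over a bytes mapping (stand-in for os.environ)."""

    def __init__(self, data, encoding):
        self._data = data          # shared bytes -> bytes dict (the "environb")
        self._encoding = encoding  # assumed charset, e.g. 'utf-8' or 'ascii'

    def __getitem__(self, key):
        raw_key = key.encode(self._encoding)
        try:
            return self._data[raw_key].decode(self._encoding)
        except UnicodeDecodeError:
            # Undecodable value: behave as if the variable did not exist.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # str.encode() raises UnicodeEncodeError if the text does not fit
        # in the charset, so nothing is stored in that case.
        raw_key = key.encode(self._encoding)
        raw_value = value.encode(self._encoding)
        self._data[raw_key] = raw_value

    def __delitem__(self, key):
        del self._data[key.encode(self._encoding)]

    def __iter__(self):
        for raw_key, raw_value in self._data.items():
            try:
                raw_value.decode(self._encoding)
                yield raw_key.decode(self._encoding)
            except UnicodeDecodeError:
                continue  # hide entries that can not be decoded

    def __len__(self):
        return sum(1 for _ in self)


# Both views share one dict, so they can not get out of sync.
environb = {b'HOME': b'/home/haypo', b'PATH': b'\xff'}
environ = StrEnvironView(environb, 'utf-8')

print(environ['HOME'])     # decodable value, visible as str
print('PATH' in environ)   # undecodable value is hidden from the str view

del environb[b'HOME']      # deleting on the bytes side...
print('HOME' in environ)   # ...also removes the key from the str view
```

Keeping a single bytes dict as the only storage avoids having to synchronize two copies, at the price of decoding on every access. A real implementation would also have to call putenv()/unsetenv(), which this sketch ignores.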

Processes
=========

I proposed patches to fix non-Windows OSes, but Antoine Pitrou also wants bytes on Windows. Amaury wrote that it's possible using the ANSI version of the Windows API. I don't know this API, so I can not contribute to this point.

---

Rejected idea
=============

Using a private Unicode block causes interoperability problems:
 - the block may already be used by other programs/libraries
 - 3rd party programs/libraries don't understand this block and may have problems displaying or processing the data

(Is the idea really rejected? It has many problems, at least.)

---

I don't have new solutions, it's just an email to restart the discussion about bytes ;-) Martin also asked for a PEP to change the posix module API to support bytes.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev