> Your proposal says that utf-8b would be used for file systems, but then
> you also say that it might be used for command line arguments and
> environment variables. So, which specific APIs will it be used with on
> Windows and on POSIX systems?

On Windows, the Wide APIs are already used throughout the code base,
e.g. SetEnvironmentVariableW/_wenviron. If you need to find out the
specific API for a specific functionality, please read the source code.

> Or will utf-8b simply not be available
> on Windows at all?

It will be available, but it won't be used automatically for anything.

> What happens if I create a Python version of tar,
> utf-8b strings slip in there, and I try to use them on Windows?

No need to create it - the tarfile module is already there. By "in there",
do you mean on the file system, or in the tarfile?

> You also assume that all Windows file system functions strictly conform
> to UTF-16 in practice (not just on paper). Have you verified that?

No, I don't assume that. I assume that all functions are strictly
available in a Wide character version, and have verified that they are.

> What's the situation on Windows CE?

I can't see how this question is relevant to the PEP. The PEP says this:

# On Windows, Python uses the wide character APIs to access
# character-oriented APIs, allowing direct conversion of the
# environmental data to Python str objects.

This is what it already does, and this is what it will continue to do.

> Another question on Linux: what happens when I decode a file system path
> with utf-8b and then pass the resulting unicode string to Gnome? To
> Qt?

You probably get mojibake, or an error; I didn't try.

> To windows.forms? To Java?

How do you do that, on Linux?

> To a unicode regular expression library?

You mean, SRE? SRE will match the code points as individual characters,
class Cs. You should have been able to find out that for yourself.

> To wprintf?

Depends on the wprintf implementation.

> AFAIK, the behavior of most libraries is
> undefined for the kinds of unicode strings you construct, and it may be
> undefined in a bad way (crash, buffer overflow, whatever).

Indeed so. This is intentional.
If you can crash Python that way, nothing gets worse by this PEP - you
can then *already* crash Python in that way.

Regards,
Martin
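
P.S. For anyone who wants to check the SRE answer themselves, here is a
minimal sketch. It assumes that the utf-8b scheme discussed in this thread
behaves like the "surrogateescape" error handler available in current
Python 3; that mapping, the sample file name, and the variable names are
assumptions for illustration, not something stated in the PEP text quoted
above.

```python
import re
import unicodedata

# A file name in Latin-1: the byte 0xE9 is not valid UTF-8 on its own.
raw = b"caf\xe9.txt"

# Assumption: "surrogateescape" stands in for utf-8b here. Each
# undecodable byte 0xNN becomes the lone surrogate U+DCNN.
decoded = raw.decode("utf-8", errors="surrogateescape")

# The escaped byte is a single code point in general category Cs.
escaped = decoded[3]
category = unicodedata.category(escaped)

# The regular expression engine treats it as one ordinary character,
# so "." consumes exactly that code point.
match = re.fullmatch(r"caf.\.txt", decoded)

# Encoding with the same handler restores the original bytes.
round_tripped = decoded.encode("utf-8", errors="surrogateescape")

# ascii() is used because printing a lone surrogate directly may fail
# on a strict UTF-8 terminal.
print(ascii(decoded), category, match is not None, round_tripped == raw)
```

Passing such a string to a strict encoder (for example a plain
.encode("utf-8")) raises UnicodeEncodeError, which is one concrete form of
the "error" outcome mentioned above for other libraries.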