On May 17, 2007, at 7:04 PM, Giovanni Bajo wrote:

> On 13/05/2007 21.31, Guido van Rossum wrote:
>
>> The answer to all of this is the filesystem encoding, which is already
>> supported. Doesn't appear particularly difficult to me.
>
> sys.getfilesystemencoding() is None on most Linux computers I have
> access to. How is the problem solved there?
>
> In fact, I have a question about this. Can anybody show me a valid
> multi-platform Python code snippet that, given a filename as *unicode*
> string, create a file with that name, possibly adjusting the name so to
> ignore an encoding problem (so that the function *always* succeed)?
>
> def dump_to_file(unicode_filename):
>     ...

    unicode_filename.encode(sys.getfilesystemencoding() or 'ascii', 'xmlcharrefreplace')

would work. Although I don't think I've seen a platform where
sys.getfilesystemencoding() is None. If I unset LANG/LANGUAGE/LC_*,
Python reports 'ANSI_X3.4-1968'. But normally on my system it reports
'UTF-8', since I have LANG=en_US.UTF-8.

The *really* tricky thing is that on Unix systems, if you want to be able
to access all the files on the disk, you have to use the byte-string API,
as not all filenames are convertible to unicode. But on Windows, if you
want to be able to access all the files on the disk, you *CANNOT* use the
byte-string API, because not all filenames (which are unicode on disk) are
convertible to byte strings via the "mbcs" encoding (which is what
getfilesystemencoding() reports).

It's quite a pain in the ass really.

James
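
For illustration, here is a minimal sketch of how the encode call above could fill in the quoted dump_to_file stub. It is written for Python 3, where str is the unicode type; recent Python 3 versions should not return None from sys.getfilesystemencoding(), so the 'ascii' fallback is kept only to mirror the one-liner. The helper name fs_safe_name, its encoding parameter, and the data parameter are hypothetical additions, not part of the original thread. The sketch only handles unencodable characters; it does not guard against names that are invalid for other reasons (path separators, NUL bytes, reserved names), so it does not strictly "always succeed".

```python
import sys


def fs_safe_name(unicode_filename, encoding=None):
    """Encode a unicode filename to bytes for the byte-string file API.

    Characters the target encoding cannot represent are replaced with
    XML character references, so the encode step itself does not raise
    UnicodeEncodeError.
    """
    # 'encoding' is a hypothetical override for testing; by default use
    # the filesystem encoding, falling back to ASCII if none is reported.
    encoding = encoding or sys.getfilesystemencoding() or 'ascii'
    return unicode_filename.encode(encoding, 'xmlcharrefreplace')


def dump_to_file(unicode_filename, data=b''):
    """Create a file under a (possibly adjusted) version of the given name."""
    with open(fs_safe_name(unicode_filename), 'wb') as f:
        f.write(data)
```

With an ASCII filesystem encoding, a name such as 'caf\u00e9.txt' would be written to disk as 'caf&#233;.txt'; with UTF-8 the name is preserved unchanged.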