Neil Hodgson wrote:
> On unicode versions of Windows, for attributes like os.listdir,
> os.getcwd, sys.argv, and os.environ, which can usefully return unicode
> strings, there are 4 options I see:
>
> 1) Always return unicode. This is the option I'd be happiest to use,
> myself, but expect this choice would change the behaviour of existing
> code too much and so produce much unhappiness.

Would be nice, but it will likely break too much code - if you let
Unicode objects enter non-Unicode-aware code, it is likely that you'll
end up getting stuck in tons of UnicodeErrors. If you want to get
a feeling for this, try running Python with the -U command line switch.

> 2) Return unicode when the text can not be represented in ASCII. This
> will cause a change of behaviour for existing code which deals with
> non-ASCII data.

+1 on this one (s/ASCII/Python's default encoding/).

> 3) Return unicode when the text can not be represented in the default
> code page. While this change can lead to breakage because of combining
> byte string and unicode strings, it is reasonably safe from the point
> of view of data integrity as current code is returning garbage strings
> that look like '?????'.

-1: code pages are evil and the reason why Unicode was invented
in the first place. This would be a step back in history.

> 4) Provide two versions of the attribute, one with the current name
> returning byte strings and a second with a "u" suffix returning
> unicode. This is the least intrusive, requiring explicit changes to
> code to receive unicode data. For patch #1231336 I chose this approach
> producing sys.argvu and os.environu.

-1 - this is what Microsoft did for many of their APIs. The
result is two parallel universes with two sets of features,
bugs, documentation, etc.

> For os.listdir the current behaviour of returning unicode when its
> argument is unicode can be retained but that is not extensible to, for
> example, sys.argv.

I don't think that using the parameter type as a "parameter" to
the function is a good idea. However, accepting both strings and
Unicode will make it easier to maintain backwards compatibility.

> Since this issue may affect many attributes a common approach
> should be chosen.

Indeed.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 11 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
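
Option 2 as amended in the reply (return a byte string when the text
fits the default encoding, Unicode otherwise) can be sketched roughly
as follows. This is only an illustration and not code from patch
#1231336. It is written for Python 3, where bytes stands in for the
byte string of the time and str for unicode. The helper name
text_or_bytes and its encoding parameter are hypothetical, and "ascii"
is assumed as the default because that was the usual default encoding
of Python 2.

```python
def text_or_bytes(value, encoding="ascii"):
    """Return `value` encoded as bytes if it fits `encoding`,
    otherwise return the original text object unchanged.

    Hypothetical helper: the name and the "ascii" default are
    assumptions made for this sketch only.
    """
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        # The text cannot be represented in the default encoding,
        # so hand back the Unicode object instead of lossy bytes.
        return value


plain = text_or_bytes("README.txt")
accented = text_or_bytes("r\u00e9sum\u00e9.txt")

print(type(plain).__name__)     # byte string for representable names
print(type(accented).__name__)  # text object for everything else
```

The caller has to cope with both return types, which is the change of
behaviour for non-ASCII data that option 2 accepts.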