I'm in full agreement with Marc-Andre below, except I don't like (1)
at all -- having used other APIs that always return Unicode (like the
Python XML parsers) it bothers me to get Unicode for no reason at all.
OTOH I think Python 3.0 should be using a Unicode model closer to
Java's.

On 7/11/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Neil Hodgson wrote:
> >    On unicode versions of Windows, for attributes like os.listdir,
> > os.getcwd, sys.argv, and os.environ, which can usefully return unicode
> > strings, there are 4 options I see:
> >
> > 1) Always return unicode. This is the option I'd be happiest to use,
> > myself, but expect this choice would change the behaviour of existing
> > code too much and so produce much unhappiness.
> 
> Would be nice, but will likely break too much code - if you
> let Unicode objects enter non-Unicode-aware code, it is likely
> that you'll end up getting stuck in tons of UnicodeErrors. If you
> want to get a feeling for this, try running Python with the -U
> command line switch.
> 
> > 2) Return unicode when the text cannot be represented in ASCII. This
> > will cause a change of behaviour for existing code which deals with
> > non-ASCII data.
> 
> +1 on this one (s/ASCII/Python's default encoding/).
> 
> > 3) Return unicode when the text cannot be represented in the default
> > code page. While this change can lead to breakage because of combining
> > byte strings and unicode strings, it is reasonably safe from the point
> > of view of data integrity, as current code is returning garbage strings
> > that look like '?????'.
> 
> -1: code pages are evil and the reason why Unicode was invented
> in the first place. This would be a step back in history.
> 
> > 4) Provide two versions of the attribute, one with the current name
> > returning byte strings and a second with a "u" suffix returning
> > unicode. This is the least intrusive, requiring explicit changes to
> > code to receive unicode data. For patch #1231336 I chose this approach
> > producing sys.argvu and os.environu.
> 
> -1 - this is what Microsoft did for many of their APIs. The
> result is two parallel universes with two sets of features,
> bugs, documentation, etc.
> 
> >     For os.listdir the current behaviour of returning unicode when its
> > argument is unicode can be retained but that is not extensible to, for
> > example, sys.argv.
> 
> I don't think that using the parameter type as a "parameter"
> to the function is a good idea. However, accepting both strings
> and Unicode will make it easier to maintain backwards
> compatibility.
> 
> >    Since this issue may affect many attributes a common approach
> > should be chosen.
> 
> Indeed.
> 
> --
> Marc-Andre Lemburg
> eGenix.com

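As an illustration only (not code from the patch or from this thread),
option (2) with Marc-Andre's amendment could be sketched roughly as
below. The sketch uses modern Python, where bytes stands in for the
byte strings discussed above and str stands in for unicode. The helper
name narrow_if_possible and its encoding parameter are hypothetical.

```python
def narrow_if_possible(text, encoding="ascii"):
    """Return a byte string when `text` fits `encoding`, else the text itself.

    Hypothetical helper: it mimics option (2), where callers only see
    unicode when the value cannot be represented in the default encoding.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # The name cannot be represented, so hand back the unicode value
        # instead of a lossy '?????' replacement.
        return text


# Example file names as an OS with a wide-character API might report them.
names = ["README.txt", "r\u00e9sum\u00e9.doc"]
results = [narrow_if_possible(name) for name in names]
```

With these inputs the first result is a byte string and the second
stays unicode, which is exactly the mixed return type that makes
option (2) a behaviour change for code handling non-ASCII data.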

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)