>> I don't follow all the implications of this thread, but I thought I should
>> mention that in python3, unicode and str are identical. If you pass 'bytes'
>> into etree.fromstring, you get bytes back. If you pass 'str' into
>> etree.fromstring, you get a (unicode) string back.
>
>In Python 3, I think we should have String = Unicode as that distinction
>becomes useless (as you point out), and there's ByteArray for binary
>matters anyway.
>
>Here's my train of thought; bear with me:
>
>The issue here is that Unicode type may produce a str instance when data
>is all ascii (because of some quirky lxml behaviour). This definitely
>breaks the contract, but I can't think of any negative side effects of
>this behaviour. And as Dieter points out, str is sometimes easier to
>work with.

How I found this out: I'm using 'unicodedata.normalize' in one of my API
functions, and it _requires_ that its second argument is unicode and not str.
While rpclib was always returning unicode, I just passed the value straight
to normalize(). This stopped working after an lxml upgrade, because suddenly
there were situations in which I got str instead of unicode. The fix was, of
course, very easy, but I don't consider this good behavior: one really can't
accept that a value is 'unicode' in one situation and 'str' in another. It
should be always 'str' or always 'unicode'.

>
>I don't want to go back to the old all-unicode behaviour just for the
>sake of preserving backwards compatibility (where String = Unicode also
>for Python 2). But we could be breaking it to have little to no benefit
>at all.
>
>Considering all this, my decision is to separate String and Unicode for
>Python 2, as;
>
>    Explicit is better than implicit.
>
>
>If you disagree, speak now or forever hold your silence :))
>
>Best,
>Burak
>
>----------
>
>_______________________________________________
>Soap mailing list
>[email protected]
>http://mail.python.org/mailman/listinfo/soap
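
For reference, here is a minimal sketch of the kind of guard described above, where a value may arrive as either a byte string or a unicode string before it is handed to unicodedata.normalize(). This is not the actual fix from the message, which is not shown. The helper names ensure_text and normalize_text are hypothetical, and the UTF-8 default encoding is an assumption. It is written to run on both Python 2 and Python 3.

```python
import unicodedata

# The text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def ensure_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings with `encoding`.

    Hypothetical helper; the default encoding is an assumption.
    """
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):  # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding)
    raise TypeError("expected text or bytes, got %r" % type(value))


def normalize_text(value, form="NFC"):
    # unicodedata.normalize() rejects byte strings, so coerce to text first.
    return unicodedata.normalize(form, ensure_text(value))
```

With a guard like this the caller no longer depends on which type the library happens to return for a given value, although it does not address the underlying complaint that the return type should be consistent in the first place.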
