David Hopwood <[EMAIL PROTECTED]> wrote:
> Brett Cannon wrote:
> [snip]
> > If you want that kind of
> > exposure, use the bytes type. Otherwise assume the usage will be by people
> > ignorant of Unicode and thus want something that will work the way they are
> > used to when compared to working in ASCII.
>
> It simply is not possible to do correct string processing in Unicode that
> will "work the way [programmers] are used to when compared to working in
> ASCII".
>
> The Unicode standard is on-line at www.unicode.org, and is quite well written,
> with lots of motivation and explanation of how processing international texts
> necessarily differs from working with ASCII. There is no excuse for any
> programmer doing text processing not to have read it.

Since basically everyone using Python today performs "text processing" in
one way or another, you are saying that basically everyone should read the
Unicode spec before using Python. Never mind that the document is longer
than most people want to read, and that you didn't provide a link to the
most applicable section (with regard to *using* Unicode).

I will also mention that in the Unicode 4.0 spec, Chapter 5 "Implementation
Guidelines" starts with:

'''
It is possible to implement a substantial subset of the Unicode Standard
as "wide ASCII" with little change to existing programming practice. ...
'''

It later goes on to explain where "wide ASCII" is not a reasonable strategy,
but I'm not sure that users of Python necessarily need to know all of that.

> Should we nevertheless try to avoid making the use of Unicode strings
> unnecessarily difficult for people who have minimal knowledge of Unicode?
> Absolutely, but not at the expense of making basic operations on strings
> asymptotically less efficient. O(1) indexing and slicing is a basic
> requirement, even if it has to be done using code units.

I believe you mean "code points"; "code units" imply non-O(1) indexing and
slicing (variable-width characters).

 - Josiah

_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
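
For readers following the "code points" versus "code units" exchange above, here is a minimal sketch of the distinction. It assumes a current CPython 3 (3.3 or later), where `str` is indexed by code point; that is not necessarily the design being debated in this thread. The helper names `utf16_units` and `code_point_at` are hypothetical and exist only for illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def code_point_at(units, index):
    """Find the index-th code point by scanning code units from the start."""
    pos = 0
    for _ in range(index):
        # A high surrogate (0xD800-0xDBFF) starts a two-unit pair.
        pos += 2 if 0xD800 <= units[pos] <= 0xDBFF else 1
    unit = units[pos]
    if 0xD800 <= unit <= 0xDBFF:
        # Combine the surrogate pair into one supplementary code point.
        low = units[pos + 1]
        return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
    return unit


# 'a', MUSICAL SYMBOL G CLEF (U+1D11E, outside the BMP), 'b'
text = "a\U0001D11Eb"
units = utf16_units(text)

print(len(text))                      # number of code points
print(len(units))                     # number of UTF-16 code units
print(hex(units[1]))                  # constant-time lookup of one code unit
print(hex(code_point_at(units, 1)))   # linear scan to reach a code point
```

Indexing the list of code units is a constant-time lookup, but it can land on half of a surrogate pair. Reaching the n-th code point in a variable-width encoding needs a scan from the start, or some extra index structure.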
