Alexander Belopolsky wrote:
> Two recently reported issues brought into light the fact that Python
> language definition is closely tied to character properties maintained
> by the Unicode Consortium. [1,2] For example, when Python switches to
> Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
> additional characters that Python can use in identifiers. [3]
>
> With Python 3.1:
>
> >>> exec('\u0CF1 = 1')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<string>", line 1
>     ೱ = 1
>     ^
> SyntaxError: invalid character in identifier
>
> but with Python 3.2a4:
>
> >>> exec('\u0CF1 = 1')
> >>> eval('\u0CF1')
> 1
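
For anyone who wants to check this on their own interpreter, here is a minimal sketch (Python 3 only). The helper name identifier_report is made up for illustration; unicodedata.unidata_version and str.isidentifier() are the standard library pieces it relies on.

```python
import sys
import unicodedata


def identifier_report(ch):
    """Return (UCD version, Python version, is ch a valid identifier?).

    Hypothetical helper, for illustration only.
    """
    return (
        unicodedata.unidata_version,  # Unicode database version in use
        sys.version_info[:2],         # (major, minor) of the interpreter
        ch.isidentifier(),            # depends on the UCD properties of ch
    )


# U+0CF1 is the character from the quoted example.
print(identifier_report('\u0CF1'))
```

The last element of the tuple should differ between an interpreter built against a pre-6.0 database and one built against Unicode 6.0.0 or later, which is the behaviour change described in the quoted text.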

Such changes are not new, but I agree that they should probably be
highlighted in the "What's new in Python x.x" document.

> Of course, the likelihood is low that this change will affect any
> user, but the change in str.isspace() reported in [1] is likely to
> cause some trouble:
>
> Python 2.6.5:
> >>> u'A\u200bB'.split()
> [u'A', u'B']
>
> Python 2.7:
> >>> u'A\u200bB'.split()
> [u'A\u200bB']

That's a classic bug fix.

> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.

Why should we deviate from the work done by the Unicode Consortium?
After all, most of their changes are in fact bug fixes as well.

> For example,
> I don't think that supporting
>
> >>> float('١٢٣٤.٥٦')
> 1234.56
>
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.

Sorry, but I don't agree. If ASCII numerals are an important aspect
of an application, the application should make sure that only those
numerals are used (e.g. by using a regular expression for checking).

In a Unicode world, not accepting non-Arabic numerals would be a
limitation, not a feature. Besides, Python has had this support since
Python 1.6.

> [1] http://bugs.python.org/issue10567
> [2] http://bugs.python.org/issue10557
> [3] http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 28 2010)
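
P.S. A minimal sketch of the kind of regular-expression check suggested above, for applications that really do need ASCII-only numbers. The names ascii_float and _ASCII_FLOAT are made up for illustration, and the pattern is an assumption about what such an application would accept: plain decimal notation only, no exponents, no "inf" or "nan". Note the explicit [0-9]: in Python 3, \d in a str pattern matches any Unicode decimal digit, which would defeat the purpose.

```python
import re

# Hypothetical pattern: optional sign, then ASCII digits with an optional
# fractional part. [0-9] is used deliberately instead of \d, because \d
# also matches non-ASCII decimal digits in Python 3 str patterns.
_ASCII_FLOAT = re.compile(r'[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)')


def ascii_float(text):
    """Convert text to float, rejecting anything but ASCII decimal notation."""
    if _ASCII_FLOAT.fullmatch(text) is None:
        raise ValueError('not an ASCII decimal number: %r' % (text,))
    return float(text)


print(ascii_float('1234.56'))

# The Arabic-Indic digits from the quoted example are accepted by float()
# but rejected by the stricter helper.
arabic_indic = '\u0661\u0662\u0663\u0664.\u0665\u0666'
print(float(arabic_indic))
try:
    ascii_float(arabic_indic)
except ValueError as exc:
    print('rejected:', exc)
```

This keeps the builtin's Unicode support intact and puts the restriction where it belongs, in the application that needs it.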