[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Martin v. Löwis
Changes by Martin v. Löwis:
--
resolution: -> invalid
status: open -> closed

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
> Thanks for the explanation. I wrongly assumed that "make all" is the
> way to regenerate both unicodedata and the encodings and that the two
> are interdependent.
Ah. I never use the Makefile.

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Martin v. Löwis wrote:
>
> Martin v. Löwis added the comment:
>
> This is not a bug, see
>
> http://www.unicode.org/reports/tr44/#Numeric_Value
>
> Characters have a Numeric_Type property of either null, Decimal, Digit, or
> Numeric. For non-Unihan cha

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
> I fail to see the relevance of gencodec to this issue ...
Thanks for the explanation. I wrongly assumed that "make all" is the way to regenerate both unicodedata and the encodings and that the two are interdependent.
--
dependencies: -Tools

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Alexander Belopolsky wrote:
>
> Alexander Belopolsky added the comment:
>
> On Mon, Nov 29, 2010 at 1:29 PM, Marc-Andre Lemburg wrote:
> ..
>>
>> I consider this a bug (which is why I added Python 2.7 to the list
>> of versions), since those code point

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
This is not a bug, see
http://www.unicode.org/reports/tr44/#Numeric_Value
Characters have a Numeric_Type property of either null, Decimal, Digit, or Numeric. For non-Unihan characters, this is denoted by filling out either no column, or (6, 7, and 8), or (7 a
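The Numeric_Type values named in this comment can be observed through Python's unicodedata module. The following is a minimal sketch, assuming Python 3 (the interactive session quoted elsewhere in this thread is Python 2). `numeric_type` is a hypothetical helper, not part of unicodedata or makeunicodedata.py, and it infers the type from which lookups succeed rather than reading the property directly.

```python
import unicodedata

def numeric_type(ch):
    """Infer Numeric_Type from which lookups succeed (hypothetical helper)."""
    # Decimal characters answer all three lookups, so test decimal first.
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"
    # Digit characters answer digit() and numeric(), but not decimal().
    if unicodedata.digit(ch, None) is not None:
        return "Digit"
    # Numeric characters answer numeric() only.
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"
    return None

# DIGIT FIVE, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF, U+4E09, LATIN SMALL LETTER A
for ch in ("5", "\u00b2", "\u00bd", "\u4e09", "a"):
    print("U+%04X" % ord(ch), numeric_type(ch))
```

In this sketch U+4E09 falls into the same bucket as the vulgar fraction: it has a numeric value but no digit or decimal value.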

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
> I am adding #10552 as a dependency because I think we should fix
> unicode data generation in 3.x before adding new features to the
> scripts.
>
> I am also not sure whether this is a bug or a feature request.
> Martin?
I fail to see the relevance of gencod

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Mon, Nov 29, 2010 at 1:29 PM, Marc-Andre Lemburg wrote:
..
>
> I consider this a bug (which is why I added Python 2.7 to the list
> of versions), since those code points need to be mapped to decimal
> and digit as well (see the references I posted; and

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Alexander Belopolsky wrote:
>
> Alexander Belopolsky added the comment:
>
> I am adding #10552 as a dependency because I think we should fix unicode data
> generation in 3.x before adding new features to the scripts.
>
> I am also not sure whether this

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Ezio Melotti
Changes by Ezio Melotti:
--
nosy: +ezio.melotti

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
I am adding #10552 as a dependency because I think we should fix unicode data generation in 3.x before adding new features to the scripts. I am also not sure whether this is a bug or a feature request. Martin?
--
dependencies: +Tools/unicode/gen

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
This is the definition of kPrimaryNumeric:
http://ftp.lanet.lv/ftp/mirror/unicode/5.0.0/ucd/Unihan.html#kPrimaryNumeric

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Here's a quick overview of the fields that are set for U+4E09:
http://www.fileformat.info/info/unicode/char/4e09/index.htm

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
The code point is also not listed as decimal digit (relevant for the int() decimal parsing):
>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal
This is the relevant part of th

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg:
The script only patches numeric data into the table (field 8), but does not update the digit field (field 7). As a result, ideographs used for Chinese digits are not recognized as digits and not evaluated by int(), long() and float():
http://en.wik
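The reported behaviour can be reproduced with a short script. This is a minimal sketch, assuming Python 3, where str is already Unicode and long() no longer exists. The variable name `san` is only illustrative.

```python
import unicodedata

san = "\u4e09"  # CJK ideograph for "three" (U+4E09)

# The numeric value (field 8) is available...
print(unicodedata.numeric(san))
# ...but there is no digit value (field 7), so the supplied default comes back.
print(unicodedata.digit(san, None))
# The str predicates reflect the same split.
print(san.isnumeric(), san.isdigit())

# int() only accepts decimal digits, so the ideograph is rejected.
try:
    int(san)
except ValueError as exc:
    print("int() rejected it:", exc)
```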