"Martin v. Löwis" wrote:
> Am 30.11.2010 21:24, schrieb Ben Finney:
>> haiyang kang <corn...@gmail.com> writes:
>>
>>>   I think it is a little ugly to have code like this: num =
>>> float("一.一"), expected result is: num = 1.1
>>
>> That's a straw man, though. The string need not be a literal in the
>> program; it can be input to the program.
>>
>>     num = float(input_from_the_external_world)
>>
>> Does that change your assessment of whether non-ASCII digits are used?
> 
> I think the OP (haiyang kang) already indicated that he finds it quite
> unlikely that anybody would possibly want to enter that. You would need
> a number of key strokes to enter each individual ideograph, plus you
> have to press the keys for keyboard layout switching to enter the Latin
> decimal separator (which you normally wouldn't use along with the Han
> numerals).

That's a somewhat limited view, IMHO. Numbers are not always entered
using a computer keyboard: you have tools like cash registers, special
numeric keypads, scanners, OCR, etc. for external entry, and you also
have other programs producing such output, e.g. MS Office if configured
that way.

The argument with the decimal point doesn't work well either, since
it's obvious that float() and int() do not support localized input.

E.g. in Germany we write 3,141 instead of 3.141:

>>> float('3,141')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): 3,141

No surprise there. The localization of the input data, e.g. removal
of thousands separators and conversion of decimal marks to the dot,
has to be done by the application, just as it has to be done now for
German floating point number literals.

The locale module already has locale.atof() and locale.atoi() for
just this purpose.
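
As a rough illustration of what "done by the application" can mean, here
is a minimal sketch. The helper name parse_german_float is made up for
this example, and it assumes that '.' only ever appears as a thousands
separator and ',' as the decimal mark. locale.atof() is the more general
route, but it depends on a suitable locale being installed and selected.

```python
def parse_german_float(text):
    """Hypothetical helper: turn German-style number text into a float.

    Assumption: '.' is only a thousands separator, ',' is the decimal mark.
    """
    # Drop thousands separators first, then switch the decimal mark to '.'.
    normalized = text.strip().replace(".", "").replace(",", ".")
    return float(normalized)


print(parse_german_float("3,141"))
print(parse_german_float("1.234,5"))
```

Input that does not follow the assumed convention (e.g. English-style
"3.141") would be silently misread by this sketch, so real code needs to
know which convention its input uses.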

FYI, here's a list of decimal digits supported by Python 2.7:

http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt:
"""
0030..0039    ; Decimal # Nd  [10] DIGIT ZERO..DIGIT NINE
0660..0669    ; Decimal # Nd  [10] ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NINE
06F0..06F9    ; Decimal # Nd  [10] EXTENDED ARABIC-INDIC DIGIT ZERO..EXTENDED ARABIC-INDIC DIGIT NINE
07C0..07C9    ; Decimal # Nd  [10] NKO DIGIT ZERO..NKO DIGIT NINE
0966..096F    ; Decimal # Nd  [10] DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE
09E6..09EF    ; Decimal # Nd  [10] BENGALI DIGIT ZERO..BENGALI DIGIT NINE
0A66..0A6F    ; Decimal # Nd  [10] GURMUKHI DIGIT ZERO..GURMUKHI DIGIT NINE
0AE6..0AEF    ; Decimal # Nd  [10] GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
0B66..0B6F    ; Decimal # Nd  [10] ORIYA DIGIT ZERO..ORIYA DIGIT NINE
0BE6..0BEF    ; Decimal # Nd  [10] TAMIL DIGIT ZERO..TAMIL DIGIT NINE
0C66..0C6F    ; Decimal # Nd  [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE
0CE6..0CEF    ; Decimal # Nd  [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
0D66..0D6F    ; Decimal # Nd  [10] MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE
0E50..0E59    ; Decimal # Nd  [10] THAI DIGIT ZERO..THAI DIGIT NINE
0ED0..0ED9    ; Decimal # Nd  [10] LAO DIGIT ZERO..LAO DIGIT NINE
0F20..0F29    ; Decimal # Nd  [10] TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE
1040..1049    ; Decimal # Nd  [10] MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE
1090..1099    ; Decimal # Nd  [10] MYANMAR SHAN DIGIT ZERO..MYANMAR SHAN DIGIT NINE
17E0..17E9    ; Decimal # Nd  [10] KHMER DIGIT ZERO..KHMER DIGIT NINE
1810..1819    ; Decimal # Nd  [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
1946..194F    ; Decimal # Nd  [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
19D0..19DA    ; Decimal # Nd  [11] NEW TAI LUE DIGIT ZERO..NEW TAI LUE THAM DIGIT ONE
1A80..1A89    ; Decimal # Nd  [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE
1A90..1A99    ; Decimal # Nd  [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
1B50..1B59    ; Decimal # Nd  [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE
1BB0..1BB9    ; Decimal # Nd  [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE
1C40..1C49    ; Decimal # Nd  [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE
1C50..1C59    ; Decimal # Nd  [10] OL CHIKI DIGIT ZERO..OL CHIKI DIGIT NINE
A620..A629    ; Decimal # Nd  [10] VAI DIGIT ZERO..VAI DIGIT NINE
A8D0..A8D9    ; Decimal # Nd  [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE
A900..A909    ; Decimal # Nd  [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE
A9D0..A9D9    ; Decimal # Nd  [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE
AA50..AA59    ; Decimal # Nd  [10] CHAM DIGIT ZERO..CHAM DIGIT NINE
ABF0..ABF9    ; Decimal # Nd  [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE
FF10..FF19    ; Decimal # Nd  [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
104A0..104A9  ; Decimal # Nd  [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
1D7CE..1D7FF  ; Decimal # Nd  [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE
"""

The Chinese and Japanese ideographs are not supported because of the
way they are defined in the Unihan database. I'm currently
investigating how we could support them as well.
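
For comparison, a minimal sketch (again assuming Python 3) of why the
OP's example fails: the Han ideograph for "one" is classified as a
letter (category Lo), not as a decimal digit (Nd), so float() rejects it.

```python
import unicodedata

han_one = "\u4e00"  # CJK ideograph meaning "one"

# General category of the ideograph: a letter, not a decimal digit.
category = unicodedata.category(han_one)

# Try the OP's example, "one" + '.' + "one", and record whether it parses.
try:
    float(han_one + "." + han_one)
    parsed = True
except ValueError:
    parsed = False

print(category, parsed)
```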

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 01 2010)

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/