Re: [Python-Dev] Python and the Unicode Character Database

Alexander Belopolsky Sun, 28 Nov 2010 17:26:17 -0800

On Sun, Nov 28, 2010 at 7:55 PM, Ben Finney <[email protected]> wrote:
..
>> Of course it is fun that Python can process Bengali numerals, but so
>> would be allowing Roman numerals. There is a reason why after careful
>> consideration, PEP 313 was ultimately rejected.
>
> Rejecting a proposed *new* capability is a different matter from
> disabling an *existing* capability which works in existing Python
> releases.


Was this capability ever documented?  It does not feel like a
deliberate feature.  If it were, '\N{ARABIC DECIMAL SEPARATOR}' would
be accepted in Arabic-Indic notation.  It feels more like a CPython
implementation detail similar to, say:

>>> int('10') is 10
True
>>> int('10000') is 10000
False
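
To make the asymmetry concrete, here is a small sketch, assuming a
current CPython 3 interpreter.  The helper name parses() is mine and
purely illustrative; only the built-in int() and float() are real API.
Digits from other scripts are accepted, but the matching decimal
separator is not:

```python
def parses(convert, text):
    """Return True if convert(text) succeeds, False if it raises ValueError."""
    try:
        convert(text)
    except ValueError:
        return False
    return True


# BENGALI DIGIT ONE, TWO, THREE
bengali_123 = "\u09e7\u09e8\u09e9"

# ARABIC-INDIC DIGIT ONE, ARABIC-INDIC DIGIT FIVE
arabic_indic_15 = "\u0661\u0665"

# ARABIC-INDIC DIGIT ONE, ARABIC DECIMAL SEPARATOR, ARABIC-INDIC DIGIT FIVE
arabic_indic_1_5 = "\u0661\u066b\u0665"

print(int(bengali_123))                  # non-ASCII decimal digits are converted
print(parses(float, arabic_indic_15))    # digits alone are accepted
print(parses(float, arabic_indic_1_5))   # the script's own separator is rejected
```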

Note that the underlying PyUnicode_EncodeDecimal() function is
described in the unicodeobject.h header file as follows:

/* --- Decimal Encoder ---------------------------------------------------- */

/* Takes a Unicode string holding a decimal value and writes it into
   an output buffer using standard ASCII digit codes.
  ..
  The encoder converts whitespace to ' ', decimal characters to their
   corresponding ASCII digit and all other Latin-1 characters except
   \0 as-is. Characters outside this range (Unicode ordinals 1-256)
   are treated as errors. This includes embedded NULL bytes.
 */
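
For readers who do not want to dig through the C source, the rules in
that comment can be modelled in a few lines of Python.  This is only a
rough sketch of the documented behaviour, not the actual
implementation; the function name encode_decimal() is hypothetical,
and it leans on unicodedata.decimal() and str.isspace() as stand-ins
for the C-level checks:

```python
import unicodedata


def encode_decimal(text):
    """Model of the documented rules: whitespace -> ' ', decimal
    characters -> ASCII digits, other Latin-1 characters (except NUL)
    unchanged, everything else is an error."""
    out = []
    for ch in text:
        if ch.isspace():
            out.append(" ")
            continue
        digit = unicodedata.decimal(ch, None)
        if digit is not None:
            # Any character with a decimal value, whatever its script.
            out.append(str(digit))
        elif 0 < ord(ch) < 256:
            # Remaining Latin-1 characters pass through as-is.
            out.append(ch)
        else:
            raise ValueError("invalid decimal character %r" % ch)
    return "".join(out)


print(encode_decimal("\u09e7\u09e8\u09e9"))   # Bengali digits one, two, three
print(encode_decimal("\t1.5\n"))              # whitespace is normalised to ' '
```

Under this model, '\N{ARABIC DECIMAL SEPARATOR}' falls through to the
error branch, since it is neither whitespace, nor a decimal character,
nor Latin-1.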

So the support for non-ASCII digits is accidental and should be
treated as a bug.