Alexander Belopolsky added the comment:

Martin v. Löwis wrote at #18236 (msg191687):
> int conversion ultimately uses Py_ISSPACE, which conceptually could
> deviate from the Unicode properties (as it is byte-based). This is not
> really an issue, since they indeed match.

Py_ISSPACE matches the Unicode White_Space property in the ASCII range (the
first 128 code points), but it differs for byte (code point) values from 128
through 255.  This leads to the following discrepancy:

>>> int('123\xa0')
123

but

>>> int(b'123\xa0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 3: invalid start byte
>>> int('123\xa0'.encode())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '123\xa0'
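
For anyone who wants to reproduce this without reading tracebacks, here is a
minimal sketch of the same three calls. The helper name try_int and the
variable names are mine and purely illustrative. The exact exception type for
the bytes cases may differ between Python versions, so the helper catches
ValueError, of which UnicodeDecodeError is a subclass.

```python
def try_int(value):
    """Return int(value), or the name of the exception class it raises."""
    try:
        return int(value)
    except ValueError as exc:  # UnicodeDecodeError is a ValueError subclass
        return type(exc).__name__


nbsp_str = '123\xa0'            # str ending in U+00A0 NO-BREAK SPACE
raw_bytes = b'123\xa0'          # bytes ending in the single byte 0xA0
utf8_bytes = nbsp_str.encode()  # UTF-8 encoding of the str: b'123\xc2\xa0'

results = {
    'str': try_int(nbsp_str),
    'raw bytes': try_int(raw_bytes),
    'utf-8 bytes': try_int(utf8_bytes),
}

# The same split shows up in the isspace() methods: the str method consults
# Unicode data, while the bytes method only knows ASCII whitespace.
str_is_space = '\xa0'.isspace()
bytes_is_space = b'\xa0'.isspace()

for label, outcome in results.items():
    print(label, '->', outcome)
print('str isspace:', str_is_space, '| bytes isspace:', bytes_is_space)
```

Only the str case should parse. Both bytes cases should fail, because the
byte-based whitespace check does not treat 0xA0 (or its UTF-8 encoding) as
trailing whitespace.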

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10581>
_______________________________________