Alexander Belopolsky wrote:

"""
Because the most commonly used characters are all in the Basic
Multilingual Plane, converting between surrogate pairs and the
original values is often not tested thoroughly. This leads to
persistent bugs, and potential security holes, even in popular and
well-reviewed application software.
"""

Maybe Python should have used UTF-8 as its internal unicode
representation. Then people who were foolish enough to assume
one character per string item would have their programs break
rather soon under only light unicode testing. :-)

--
Greg
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to