[issue28561] Report surrogate characters range in utf8_encoder

Xiang Zhang Sun, 30 Oct 2016 00:47:11 -0700

New submission from Xiang Zhang:

In utf8_encoder, when a codec error handler returns a string containing 
non-ASCII characters, a UnicodeEncodeError is raised, but the reported start 
and end positions are not accurate. This looks like an oversight left over 
from an earlier change. Previously, utf8_encoder recognized only one surrogate 
character at a time. Since 2b5357b38366, it tries to recognize as many 
consecutive surrogates as possible at a time. The patch also includes some 
cleanup.
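
A minimal sketch of how the reported positions can be inspected from Python, 
assuming the "codecs" in question is an error handler registered with 
codecs.register_error. The handler name "demo-non-ascii" and the sample string 
are made up for illustration and are not taken from the patch. The exact end 
position printed may differ between interpreters with and without the fix.

```python
import codecs


def non_ascii_handler(exc):
    # Hypothetical handler: replace the offending range with a non-ASCII
    # character and resume encoding right after that range.
    return ("\xe9", exc.end)


codecs.register_error("demo-non-ascii", non_ascii_handler)

# Two consecutive lone surrogates at indices 1 and 2.
text = "a\udc80\udc81b"

try:
    text.encode("utf-8", "demo-non-ascii")
except UnicodeEncodeError as err:
    # The UTF-8 encoder rejects a non-ASCII replacement string and raises
    # the error itself; start/end describe the range it reports.
    reported = (err.start, err.end)
else:
    reported = None

print(reported)
```

If the whole surrogate run is reported, the range should start at index 1 and 
end at index 3 for this input.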


----------
files: utf8_encoder.patch
keywords: patch
messages: 279712
nosy: haypo, serhiy.storchaka, xiang.zhang
priority: normal
severity: normal
stage: patch review
status: open
title: Report surrogate characters range in utf8_encoder
type: behavior
Added file: http://bugs.python.org/file45271/utf8_encoder.patch

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue28561>
_______________________________________
