[issue28561] Report surrogate characters range in utf8_encoder

2016-10-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thanks Xiang. Yes, this is all a follow-up to issue25267. -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue28561] Report surrogate characters range in utf8_encoder

2016-10-30 Thread Roundup Robot
Roundup Robot added the comment: New changeset 542065b03c10 by Serhiy Storchaka in branch '3.6': Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. https://hg.python.org/cpython/rev/542065b03c10 New changeset ee3670d9bda6 by Serhiy Storchaka in branch 'default': Issue

[issue28561] Report surrogate characters range in utf8_encoder

2016-10-30 Thread Xiang Zhang
Changes by Xiang Zhang: Added file: http://bugs.python.org/file45273/utf8_encoder_v2.patch

[issue28561] Report surrogate characters range in utf8_encoder

2016-10-30 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- assignee: -> serhiy.storchaka components: +Interpreter Core versions: +Python 3.6, Python 3.7

[issue28561] Report surrogate characters range in utf8_encoder

2016-10-30 Thread Xiang Zhang
New submission from Xiang Zhang: In utf8_encoder, when a codecs error handler returns a string with non-ASCII characters, the encoder raises UnicodeEncodeError, but the reported start and end positions are not accurate. This seems like an oversight during the code's evolution. Previously, utf8_encoder only recognized one surrogate character at a time.
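
For readers who want to observe the reported range themselves, here is a minimal sketch rather than anything taken from the patch. It assumes a recent CPython 3 interpreter; the handler name "demo_non_ascii" and the helper probe() are hypothetical names chosen for illustration. The handler deliberately returns a non-ASCII replacement string, which the UTF-8 encoder rejects, so the start and end attributes of the resulting UnicodeEncodeError can be inspected.

```python
import codecs


def non_ascii_replacement(exc):
    # Hypothetical handler: answer every encoding error with a non-ASCII
    # replacement string and resume after the failing span.
    return ("\u00e9", exc.end)


# "demo_non_ascii" is an illustrative name, not a built-in error handler.
codecs.register_error("demo_non_ascii", non_ascii_replacement)


def probe(text):
    """Return (start, end) of the UnicodeEncodeError raised while encoding
    text to UTF-8, or None if encoding succeeded."""
    try:
        text.encode("utf-8", "demo_non_ascii")
    except UnicodeEncodeError as err:
        return err.start, err.end
    return None


if __name__ == "__main__":
    # Two consecutive lone surrogates sit at indices 1 and 2.
    print(probe("a\udc80\udc81b"))
```

The printed range may differ between interpreter versions, which is the point of the report: the issue concerns whether the range covers one surrogate or the whole run.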