Marc-Andre Lemburg <m...@egenix.com> added the comment:

Ezio Melotti wrote:
> 
> Ezio Melotti <ezio.melo...@gmail.com> added the comment:
> 
> Even if they are not valid, they still "eat" all the 4/5/6 bytes, so they 
> should be fixed too. I haven't seen anything about these bytes in chapter 3 so 
> far, but there are at least two possibilities:
> 1) consider all the bytes in range F5-FD as invalid without looking for the 
> other bytes;
> 2) try to read the next 4/5/6 bytes and fail if they are not continuation 
> bytes.
> We can also look at what others do (e.g. browsers and other languages).

Marking those entries as 0 in the length table would make them consume
only one byte. Compared to the current state, however, that would
produce more replacement code points in the output, so perhaps applying
the same logic as for the other sequences is the better strategy.
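
To make the trade-off concrete, here is a rough sketch of the two
strategies. It is not the actual decoder code: the helper names
(expected_length, replace_invalid) and the consume_sequence flag are
made up for illustration, and the toy only passes ASCII through,
replacing every other byte. The second strategy is just one possible
reading of "the same logic as for the other sequences".

```python
# Hypothetical sketch, not CPython's decoder: it only illustrates how many
# replacement code points each strategy produces for lead bytes F5-FD.
REPLACEMENT = "\ufffd"


def expected_length(lead):
    """Sequence length the original (pre-RFC 3629) UTF-8 scheme implies for F5-FD."""
    if 0xF5 <= lead <= 0xF7:
        return 4
    if 0xF8 <= lead <= 0xFB:
        return 5
    if 0xFC <= lead <= 0xFD:
        return 6
    return 1


def is_continuation(byte):
    return 0x80 <= byte <= 0xBF


def replace_invalid(data, consume_sequence):
    """Toy 'replace' handler: ASCII passes through, everything else is replaced.

    consume_sequence=False: an F5-FD lead byte uses one byte (length table
    entry 0), so each following continuation byte is replaced separately.
    consume_sequence=True: the lead byte and the continuation bytes that
    follow it (up to the expected length) become one replacement.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif consume_sequence and 0xF5 <= b <= 0xFD:
            end = i + 1
            limit = min(len(data), i + expected_length(b))
            while end < limit and is_continuation(data[end]):
                end += 1
            out.append(REPLACEMENT)
            i = end
        else:
            out.append(REPLACEMENT)
            i += 1
    return "".join(out)
```

With the input b"a\xf5\x80\x80\x80b", the one-byte strategy yields four
replacement code points between "a" and "b", while consuming the whole
sequence yields a single one.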

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________
