Dominikus Dittes Scherkl wrote:

> For UTF-8 simply use the now forbidden sequences starting with 0xFC or
> 0xFD (maybe not 0xFE and 0xFF as this sometimes is misused for
> encoding-detection).
> For ease of description let's say they introduce a 6-byte sequence,
> the first encodes only one bit (0xFC or 0xFD), the five follow-up
> bytes 6 bits each, together using up the full 31-bit range of UTF-32

We don’t need to invent or re-describe any of this. Support for values up to 
U+7FFFFFFF was part of the original design of UTF-8, as can still be seen here:

https://datatracker.ietf.org/doc/html/rfc2279#section-2
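
For illustration, here is a minimal Python sketch of that original scheme,
assuming the byte layouts tabulated in RFC 2279 section 2. The function name
encode_rfc2279 is hypothetical and not part of any library. It shows that
lead bytes 0xFC and 0xFD already introduce a 6-byte sequence carrying 31 bits.

```python
def encode_rfc2279(cp: int) -> bytes:
    """Encode a value in 0..0x7FFFFFFF with the original 1-to-6-byte UTF-8 layout.

    Illustrative sketch only: no surrogate or noncharacter checks are made.
    """
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the 31-bit range")
    if cp < 0x80:
        return bytes([cp])  # single byte, 0xxxxxxx

    # (exclusive upper limit, lead-byte marker, number of continuation bytes)
    for limit, lead, ncont in (
        (0x800, 0xC0, 1),       # 110xxxxx
        (0x10000, 0xE0, 2),     # 1110xxxx
        (0x200000, 0xF0, 3),    # 11110xxx
        (0x4000000, 0xF8, 4),   # 111110xx
        (0x80000000, 0xFC, 5),  # 1111110x -> lead byte is 0xFC or 0xFD
    ):
        if cp < limit:
            break

    # Lead byte carries the top bits; each continuation byte carries 6 bits.
    out = [lead | (cp >> (6 * ncont))]
    for i in range(ncont - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    for value in (0x41, 0x20AC, 0x10FFFF, 0x4000000, 0x7FFFFFFF):
        print(f"{value:#x}: {encode_rfc2279(value).hex(' ')}")
```

For values up to U+10FFFF this produces the same bytes as today's UTF-8.
The 5- and 6-byte forms are outside the current definition (RFC 3629), so
modern decoders will reject them.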

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

