Dominikus Dittes Scherkl wrote:

> For UTF-8 simply use the now forbidden sequences starting with 0xFC or
> 0xFD (maybe not 0xFE and 0xFF as this sometimes is missused for
> encoding-detection).
> For ease of description let's say they introduce a 6-byte sequence,
> the first encodes only one bit (0xFC or 0xFD), the five follow-up
> bytes 6bit each, together using up the full 31bit range of UTF-32

We don’t need to invent or re-describe any of this. Support for values up to U+7FFFFFFF was part of the original design of UTF-8, as can still be seen here:

https://datatracker.ietf.org/doc/html/rfc2279#section-2

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
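
For anyone who wants to see the byte layout concretely, here is a minimal Python sketch of the original 1-to-6-byte scheme from RFC 2279 section 2. The function name encode_utf8_rfc2279 is made up for illustration and is not part of any library. Note that sequences longer than four bytes, and values above U+10FFFF, are not valid UTF-8 under the current definition (RFC 3629), so a conforming modern decoder should reject them. The sketch also does not filter out surrogate code points.

```python
def encode_utf8_rfc2279(cp: int) -> bytes:
    """Encode cp (0..0x7FFFFFFF) with the original 1-6 byte UTF-8 layout.

    Hypothetical helper for illustration only; output longer than four
    bytes is NOT valid UTF-8 under the current standard (RFC 3629).
    """
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the 31-bit range")
    if cp < 0x80:
        return bytes([cp])  # single byte: 0xxxxxxx

    # (exclusive upper limit, lead-byte marker, number of continuation bytes)
    layouts = (
        (0x800, 0xC0, 1),       # 110xxxxx + 1 continuation byte
        (0x10000, 0xE0, 2),     # 1110xxxx + 2
        (0x200000, 0xF0, 3),    # 11110xxx + 3
        (0x4000000, 0xF8, 4),   # 111110xx + 4
        (0x80000000, 0xFC, 5),  # 1111110x + 5 (lead byte is 0xFC or 0xFD)
    )
    for limit, lead, n in layouts:
        if cp < limit:
            break

    # Lead byte carries the top bits; each continuation byte carries 6 bits.
    out = [lead | (cp >> (6 * n))]
    for i in range(n - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    for value in (0x20AC, 0x10FFFF, 0x4000000, 0x7FFFFFFF):
        print(f"{value:#x}: {encode_utf8_rfc2279(value).hex(' ')}")
```

For values up to U+10FFFF (surrogates aside) this produces the same bytes as ordinary UTF-8. Above that, the lead byte becomes 0xF8 through 0xFD, and the six-byte form carries 1 + 5 × 6 = 31 bits, which is exactly the layout described in the quoted message.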
