Hans Aberg <haberg dash 1 at telia dot com> wrote:

>>> However I wonder what would be the effect of D80 in UTF-32: is
>>> <0xFFFFFFFF> a valid "32-bit string" ?
>>
>> The value 0xFFFFFFFF cannot appear in a UTF-32 string. Therefore it
>> cannot represent a unit of encoded text in a UTF-32 string.
>
> Even though the values with highest bit set are not a part of original
> UTF-32, it can easily be extended also to original UTF-8, which may be
> simpler to implement.

"Original UTF-8," regardless of where defined, only ever encoded scalar
values up to 0x7FFFFFFF. See, for example, RFC 2279.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

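To make the 0x7FFFFFFF limit concrete, here is a minimal Python sketch of the one-to-six-byte UTF-8 form laid out in RFC 2279. The function name utf8_rfc2279_encode is hypothetical and chosen only for this illustration, and the sketch makes no attempt to handle surrogates or any other special values. The six-byte form carries 1 + 5*6 = 31 payload bits, so a value with the highest bit set, such as 0xFFFFFFFF, has nowhere to go.

```python
def utf8_rfc2279_encode(value: int) -> bytes:
    """Encode one value using the 1..6 byte UTF-8 form of RFC 2279.

    Illustrative sketch only: the name is hypothetical, and surrogates
    and other special values are not checked.
    """
    if value < 0 or value > 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if value < 0x80:
        return bytes([value])

    # (largest value for this length, total bytes, lead-byte marker)
    for limit, length, lead in (
        (0x7FF, 2, 0xC0),
        (0xFFFF, 3, 0xE0),
        (0x1FFFFF, 4, 0xF0),
        (0x3FFFFFF, 5, 0xF8),
        (0x7FFFFFFF, 6, 0xFC),
    ):
        if value <= limit:
            break

    out = bytearray(length)
    # Fill continuation bytes from the end, six payload bits each.
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (value & 0x3F)
        value >>= 6
    # The remaining high bits go into the lead byte.
    out[0] = lead | value
    return bytes(out)


print(utf8_rfc2279_encode(0x7FFFFFFF).hex())  # fdbfbfbfbfbf
```

Calling utf8_rfc2279_encode(0xFFFFFFFF) raises ValueError in this sketch, since that value needs 32 bits.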
