Follow-up Comment #10, bug #40720 (group groff):

A more practical point implicating the UTF-8 migration is that right now we
accept as identifiers code point sequences that are invalid as UTF-8.
Here's an example from our friend bug #67734 again.
printf '.nr \311l 57\n.tm \\n(\311l\n'
(That's "Él" when read as ISO Latin-1.)
This is invalid UTF-8 because \311 starts a multibyte sequence; its high bit
is set and therefore this byte must be followed by at least one more byte with
its high bit set.  Here it is followed by "l", a plain ASCII byte whose high
bit is clear.
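
To make the byte-level argument concrete, here is a minimal sketch in Python
(not groff code, and not how GNU troff validates input).  It assumes Python's
strict UTF-8 decoder as a stand-in validator; the names `name` and
`is_valid_utf8` are hypothetical and exist only for this illustration.

```python
# The two bytes that the printf command emits for the register name:
# octal \311 (0xC9) followed by ASCII "l" (0x6C).
name = b"\311l"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Read as ISO Latin-1, the bytes spell E-acute followed by "l".
as_latin1 = name.decode("latin-1")

# 0xC9 is the lead byte of a two-byte sequence, but "l" is not a
# continuation byte, so the strict decoder rejects the pair.
raw_is_valid = is_valid_utf8(name)

# The same two characters properly encoded as UTF-8 are three bytes
# (0xC3 0x89 0x6C) and pass the check.
reencoded_is_valid = is_valid_utf8("\u00c9l".encode("utf-8"))
```

With these definitions, `raw_is_valid` is False and `reencoded_is_valid` is
True, which is the distinction at issue: the same name is acceptable as
Latin-1 input but not as UTF-8 input.
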
Bug #67734 reasons that we might as well start rejecting those things now,
because when we eventually land direct reading of UTF-8 in GNU _troff_, we'll
be doing so at that time anyway.
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?40720>
