> Date: Wed, 12 Sep 2018 00:13:52 +0200
> Cc: unicode@unicode.org
> From: Hans Åberg via Unicode <unicode@unicode.org>
>
> It might be useful to represent non-UTF-8 bytes as Unicode code points. One
> way might be to use a codepoint to indicate high bit set followed by the byte
> value with its high bit set to 0, that is, truncated into the ASCII range.
> For example, U+0080 looks like it is not in use, though I could not verify
> this.

You must use a codepoint that is not defined by Unicode and never will be. That is what Emacs does: it extends the Unicode codepoint space beyond 0x10FFFF.
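
To make the idea concrete, here is a minimal sketch in Python of mapping undecodable bytes to integers above 0x10FFFF. It is not Emacs's implementation. The function names and the RAW_BASE constant are made up for illustration, and the range 0x3FFF80..0x3FFFFF is chosen to match what I understand Emacs to use for raw 8-bit bytes. Since a Python str cannot hold values above 0x10FFFF, the sketch works on plain lists of integers.

```python
# Hypothetical constant: raw byte b (0x80..0xFF) is stored as RAW_BASE + b,
# giving the range 0x3FFF80..0x3FFFFF, well above the Unicode limit 0x10FFFF.
RAW_BASE = 0x3FFF00


def decode_with_raw_bytes(data):
    """Decode UTF-8 into a list of integer code points.

    Bytes that are not part of valid UTF-8 become RAW_BASE + byte
    instead of raising an error or being replaced by U+FFFD.
    """
    # surrogateescape turns each undecodable byte b into U+DC00 + b.
    # Python's strict UTF-8 decoder never yields surrogates from valid
    # input, so anything in U+DC80..U+DCFF below is an escaped raw byte.
    text = data.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append(RAW_BASE + (cp - 0xDC00))
        else:
            out.append(cp)
    return out


def encode_with_raw_bytes(code_points):
    """Inverse of decode_with_raw_bytes: restores the original bytes."""
    out = bytearray()
    for cp in code_points:
        if RAW_BASE + 0x80 <= cp <= RAW_BASE + 0xFF:
            out.append(cp - RAW_BASE)          # emit the raw byte unchanged
        else:
            out += chr(cp).encode("utf-8")     # ordinary Unicode character
    return bytes(out)
```

Because the raw-byte values lie outside the Unicode codespace, they cannot collide with any character that Unicode assigns now or later, and the round trip through decode and encode gives back the input bytes exactly.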