On Wed, Sep 12, 2018 at 11:37 AM Hans Åberg via Unicode <unicode@unicode.org> wrote:
> The idea is to extend Unicode itself, so that those bytes can be represented
> by legal codepoints.

Extending Unicode itself would likely create more problems than it would solve. Extending the value space of Unicode scalar values would be extremely disruptive for systems whose design is deeply committed to the current definitions of UTF-16 and UTF-8 staying unchanged. Assigning a scalar value within the current Unicode scalar value space to currently malformed bytes has a different problem: those scalar values would lose the information of whether they came from malformed bytes or from the well-formed encoding of those same scalar values. It seems better to let applications whose use cases involve representing non-Unicode values use a special-purpose extension of their own.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
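
The information-loss point can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not part of the message above: the hypothetical constant ESCAPE_BASE picks an arbitrary Private Use Area range, and decode_to_scalars is a made-up helper that maps each malformed byte to an assigned scalar value in that range. For contrast, the sketch also uses Python's built-in "surrogateescape" error handler (PEP 383), which is one example of an application-level extension: it maps malformed bytes to lone surrogates, which are not Unicode scalar values and cannot result from decoding well-formed UTF-8.

```python
# Hypothetical choice of Private Use Area block, for illustration only.
ESCAPE_BASE = 0xF700


def decode_to_scalars(data: bytes) -> str:
    """Decode UTF-8, mapping each malformed byte to a PUA scalar value.

    This models "assign an existing scalar value to malformed bytes".
    """
    # surrogateescape turns each undecodable byte 0xNN into U+DCNN;
    # remap those lone surrogates into the PUA range chosen above.
    text = data.decode("utf-8", errors="surrogateescape")
    return "".join(
        chr(ESCAPE_BASE + (ord(c) - 0xDC00)) if 0xDC80 <= ord(c) <= 0xDCFF else c
        for c in text
    )


def roundtrip_with_surrogateescape(data: bytes) -> bytes:
    """Decode and re-encode using lone surrogates as the escape mechanism."""
    text = data.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


# Lossy: a malformed byte and a well-formed sequence collide.
from_malformed = decode_to_scalars(b"\xff")
from_well_formed = decode_to_scalars("\uf7ff".encode("utf-8"))
print(from_malformed == from_well_formed)  # True: the origin is lost

# Lossless: the escape values lie outside the scalar value space.
raw = b"abc\xff\xed\xb3\xbf"
print(roundtrip_with_surrogateescape(raw) == raw)  # True
```

In the first case the decoded string cannot tell the byte 0xFF apart from a genuine U+F7FF, so the original bytes cannot be reconstructed. In the second case the round trip is exact, but the intermediate string is not a valid sequence of Unicode scalar values, which is why this sort of mechanism fits inside an application and not in Unicode itself.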