Re: Unicode String Models

Henri Sivonen via Unicode Wed, 12 Sep 2018 22:11:29 -0700

On Wed, Sep 12, 2018 at 11:37 AM Hans Åberg via Unicode
<unicode@unicode.org> wrote:
> The idea is to extend Unicode itself, so that those bytes can be represented 
> by legal codepoints.


Extending Unicode itself would likely create more problems than it
would solve. Extending the value space of Unicode scalar values would
be extremely disruptive for systems whose design is deeply committed
to the current definitions of UTF-16 and UTF-8 staying unchanged.
Assigning scalar values within the current Unicode scalar value space
to currently malformed bytes would lose information: once decoded,
such a scalar value would no longer indicate whether it came from
malformed bytes or from the well-formed encoding of that same scalar
value.
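
To make the information loss concrete, here is a minimal sketch in
Python. It assumes a purely hypothetical scheme in which each
malformed byte is mapped to the scalar value HYPOTHETICAL_BASE + byte;
the base U+F0000 and the error handler name are inventions for this
illustration, not part of any proposal or standard.

```python
import codecs

# Hypothetical: malformed byte 0xNN decodes to scalar value U+F00NN.
# Both the base and the handler name below are made up for illustration.
HYPOTHETICAL_BASE = 0xF0000


def _escape_to_scalar(exc):
    """Map each undecodable byte to an ordinary scalar value."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(chr(HYPOTHETICAL_BASE + b) for b in bad)
    return replacement, exc.end


codecs.register_error("hypothetical-scalar-escape", _escape_to_scalar)

# Input 1: a single malformed byte.
malformed = b"\x80"
# Input 2: the well-formed UTF-8 encoding of the scalar value that
# the scheme assigns to that byte.
well_formed = chr(HYPOTHETICAL_BASE + 0x80).encode("utf-8")

from_malformed = malformed.decode("utf-8", "hypothetical-scalar-escape")
from_well_formed = well_formed.decode("utf-8", "hypothetical-scalar-escape")

# Two different byte sequences, one decoded string: the decoded form
# cannot tell which of the two inputs it came from.
print(malformed != well_formed)            # the inputs differ
print(from_malformed == from_well_formed)  # the outputs do not
```

After decoding, nothing distinguishes the two inputs, so a round trip
back to the original bytes cannot be guaranteed for both of them.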

It seems better to let applications whose use cases involve
representing non-Unicode values use a special-purpose extension of
their own.
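
One existing example of such an application-level extension, offered
here only as an illustration and not something the message above
refers to, is Python's "surrogateescape" error handler (PEP 383). It
represents each undecodable byte as a lone surrogate, which is not a
Unicode scalar value, so the escaped data stays distinguishable from
well-formed text. A short sketch:

```python
# Latin-1 bytes for "café"; the trailing 0xE9 is malformed as UTF-8.
raw = b"caf\xe9"

# Application-level extension: the undecodable byte 0xE9 becomes the
# lone surrogate U+DCE9, which is not a Unicode scalar value.
text = raw.decode("utf-8", errors="surrogateescape")

# The same handler restores the original bytes on the way out.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# A strict UTF-8 encoder refuses the lone surrogate, so the extension
# does not leak into interchange unnoticed.
try:
    text.encode("utf-8")
    strict_rejected = False
except UnicodeEncodeError:
    strict_rejected = True

print(round_tripped == raw)
print(strict_rejected)
```

The extension lives entirely inside the application: UTF-8 and the
set of scalar values are unchanged, and standard-conforming encoders
still reject the escaped values.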

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
