When it comes to methods operating on
buffers there's always the tension between viewing the buffer as
text elements vs. as data elements. For some purposes, from error
detection to data cleanup, you need to be able to treat the buffer
as data elements. For many
On 16/10/2015 02:22, Don Osborn wrote:
I was surprised to learn of continued reference to and presumably use
of 8-bit fonts modified two decades ago for the extended Latin
alphabets of Malian languages, and wondered if anyone has similar
observations in other countries. Or if there have
2015-10-20 2:07 GMT+02:00 Richard Wordingham <
richard.wording...@ntlworld.com>:
> Now, as we know, UTF-32 does not handle the full range of Unicode code
> points;
??? All valid UTFs handle the full range of valid Unicode code points. This
includes UTF-32 as well as UTF-16 and UTF-8 (and their
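
As a quick check of the point in this reply, the sketch below encodes the highest code point, U+10FFFF, in each of the three encoding forms and decodes it again. It uses only Python's built-in codecs; the choice of the big-endian variants is an assumption made here so that no byte order mark appears in the output.

```python
# Highest valid Unicode code point.
top = chr(0x10FFFF)

# Encode in each encoding form (big-endian variants, so no BOM is emitted).
utf8 = top.encode("utf-8")
utf16 = top.encode("utf-16-be")
utf32 = top.encode("utf-32-be")

# Each form represents the code point and decodes back to it.
round_trips = all(
    data.decode(codec) == top
    for data, codec in [(utf8, "utf-8"), (utf16, "utf-16-be"), (utf32, "utf-32-be")]
)
```

UTF-8 needs four bytes, UTF-16 needs a surrogate pair (two 16-bit code units), and UTF-32 needs one 32-bit code unit.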
> there is never any excuse for software to create unpaired surrogates, or
> any other sort of invalid code unit sequences
First off, it depends on when one is encountered. They are invalid in
UTF-16, but are permitted in a Unicode 16-bit string.
But more fundamentally, there may not be "excuses"
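
The distinction drawn in this reply can be illustrated with a small check. This is only a sketch: it assumes the 16-bit string is given as a list of integer code units, and the function names find_unpaired_surrogates and is_well_formed_utf16 are hypothetical, not taken from any library.

```python
def find_unpaired_surrogates(units):
    """Return the indices of surrogate code units that have no partner.

    `units` is a sequence of 16-bit integers (an assumed representation
    of a "Unicode 16-bit string").
    """
    bad = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: valid only if a low surrogate follows directly.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            bad.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate that was not consumed as part of a pair.
            bad.append(i)
        i += 1
    return bad


def is_well_formed_utf16(units):
    """True if the code unit sequence contains no unpaired surrogates."""
    return not find_unpaired_surrogates(units)
```

A list such as [0x0041, 0xD800, 0x0042] is a perfectly storable 16-bit string, yet is_well_formed_utf16 returns False for it because the code unit at index 1 is an isolated high surrogate.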
> A good Unicode string in a programming language
Yes, that would be great, no question. It isn't, however, the case in most
programming languages (measured by the amount of software written in them).
The original question that started these threads was how to handle isolated
surrogates. If you
On Wednesday, 21 October 2015 at 02:23, Mark Davis ☕️ wrote:
> But more fundamentally, there may not be "excuses" for such software, but it
> happens anyway. Pretending it doesn't, makes for unhappy customers. For
> example, you don't want to be throwing an exception when one is encountered,
>
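
One way to avoid throwing an exception on an isolated surrogate is to substitute U+FFFD REPLACEMENT CHARACTER and carry on. The sketch below shows that approach under stated assumptions: the input is a list of 16-bit integer code units, and decode_lenient is a hypothetical helper name. It is one possible policy, not the behavior the quoted message prescribes.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def decode_lenient(units):
    """Decode 16-bit code units to a str, never raising on bad input.

    Well-formed surrogate pairs become supplementary code points;
    each isolated surrogate becomes U+FFFD.
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Combine a high and low surrogate into one code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            out.append(chr(cp))
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Isolated surrogate: substitute instead of failing.
            out.append(chr(REPLACEMENT))
            i += 1
        else:
            out.append(chr(u))
            i += 1
    return "".join(out)
```

With this policy, decode_lenient([0x41, 0xD800, 0x42]) yields "A\ufffdB", so the surrounding text survives and the damage stays visible at the point where it occurred.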