On Friday, 1 December 2017 at 18:31:46 UTC, Jonathan M Davis
wrote:
On Friday, December 01, 2017 09:49:08 Steven Schveighoffer via
Digitalmars-d wrote:
On 12/1/17 7:26 AM, Patrick Schluter wrote:
> On Friday, 1 December 2017 at 06:07:07 UTC, Patrick Schluter
> wrote:
>> isolated codepoints.
>
> I meant isolated code-units, of course.
Hehe, it's impossible for me to talk about code points and
code units without having to pause and consider which one I
mean :)
What, you mean that Unicode can be confusing? No way! ;)
LOL. I have to be careful with that too. What bugs me even more
though is that the Unicode spec talks about code points being
characters, and then talks about combining characters for
grapheme clusters - and this in spite of the fact that what
most people would consider a character is a grapheme cluster
and _not_ a code point. But they presumably had to come up with
new terms for a lot of this nonsense, and that's not always
easy.
Regardless, what they came up with is complicated enough that
it's arguably a miracle whenever a program actually handles
Unicode text 100% correctly. :|
- Jonathan M Davis
And dealing with that complexity can itself introduce bugs,
because it's hard to get right. That's why it's sometimes
easier just to simplify things and to exclude certain ways of
looking at the string.
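
To make the three levels from this thread concrete, here's a rough sketch in Python that counts the same string as code units, code points, and (approximately) grapheme clusters. The `count_units` helper is a made-up name for illustration, and the grapheme count is a deliberate simplification that only folds combining marks into the preceding code point; it is not the full segmentation algorithm from the Unicode spec.

```python
import unicodedata


def count_units(s):
    """Return (UTF-8 code units, code points, approximate graphemes) for s."""
    # Code units: the bytes of the UTF-8 encoding.
    code_units = len(s.encode("utf-8"))
    # Code points: a Python str is a sequence of code points.
    code_points = len(s)
    # Approximation (assumption): every code point whose canonical combining
    # class is 0 starts a new grapheme. Real grapheme segmentation has many
    # more rules (emoji sequences, Hangul jamo, and so on).
    graphemes = sum(1 for ch in s if unicodedata.combining(ch) == 0)
    return code_units, code_points, graphemes


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
print(count_units("e\u0301"))  # (3, 2, 1)
```

So what most people would call a single character is three code units and two code points here, which is exactly the kind of mismatch that makes it tempting to pick one view of the string and stick with it.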