On 17.01.2011 00:58, Andrei Alexandrescu wrote:
On 1/16/11 3:20 PM, Michel Fortin wrote:
On 2011-01-16 14:29:04 -0500, Andrei Alexandrescu
<[email protected]> said:
But most strings don't contain combining characters or unnormalized
strings.
I think we should expect combining marks to be used more and more as our
OS text system and fonts start supporting them better. Them being rare
might be true today, but what do you know about tomorrow?
I don't think languages will acquire more diacritics soon. I do hope, of
course, that D applications gain more usage in the Arabic, Hebrew etc.
world.
So why does D use Unicode anyway?
If you don't care about rarely used languages anyway, you could have
used UCS-2 like Java. Or plain 8-bit ISO-8859-* (the user can decide
which encoding he wants/needs).
You could as well say "we don't need to use dchar to represent a proper
code point, wchar is enough for most use cases and has less overhead
anyway".
I think it's reasonable to understand why I'm happy with the current
state of affairs. It is better than anything we've had before and
better than everything else I've tried.
It is indeed easy to understand why you're happy with the current state
of affairs: you never had to deal with multi-code-point characters and
can't imagine yourself having to deal with them on a semi-frequent
basis.
Do you, and can you?
Other people won't be so happy with this state of affairs, but
they'll probably notice only after most of their code has been written
unaware of the problem.
They can't be unaware and write said code.
Fun fact: Germany recently introduced a new ID card and some of the
software that was developed for this and is used in some record sections
fucks up when a name contains diacritics.
Especially when you're handling names (and a lot of software does, I
think), it's crucial to have proper support for all kinds of characters.
Of course many programmers are not aware that just because umlauts and ß
work, it doesn't mean that all other kinds of unusual characters work as
well.
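To illustrate that last point: the same visible "ü" can be stored either as
one precomposed code point or as a base letter plus a combining mark, and
code that was only ever tested with the first form can trip over the second.
A minimal sketch in Python (not D, purely for illustration; the helper names
code_points and same_text are made up for this example):

```python
import unicodedata

# Two ways to write the German umlaut "ü":
precomposed = "\u00fc"   # single code point U+00FC
decomposed = "u\u0308"   # "u" followed by U+0308 COMBINING DIAERESIS


def code_points(s: str) -> int:
    """Number of Unicode code points (this is what Python's len() counts)."""
    return len(s)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(code_points(precomposed))             # 1
    print(code_points(decomposed))              # 2
    print(precomposed == decomposed)            # False: naive comparison fails
    print(same_text(precomposed, decomposed))   # True: equal after normalization
```

Both strings render as the same character, so a name lookup that compares
them code point by code point would treat them as two different names.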
Cheers,
- Daniel