On 12/1/2024 9:09 PM, David Starner via Unicode wrote:
> On Sun, Dec 1, 2024 at 7:54 PM Dominikus Dittes Scherkl via Unicode
> <[email protected]> wrote:
>> But in automatic text processing the old form is simply a bug that needs
>> to be fixed. The new form has to be the "default" - otherwise
>> implementations will proliferate this bug forever.
> Various systems take for granted that case folding is stable.
Very much agreed on that one. Usually in the context of "identifiers"
and not in free text.
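
To make the identifier case concrete, here is a minimal sketch in Python. It assumes Python's built-in `str.casefold()`, which applies locale-independent Unicode full case folding; `same_identifier` is a hypothetical helper name used only for illustration, not something taken from any of the systems under discussion.

```python
def same_identifier(a: str, b: str) -> bool:
    """Caseless comparison via untailored Unicode full case folding."""
    return a.casefold() == b.casefold()


# Lowercase sharp s (U+00DF) and capital sharp s (U+1E9E) both fold to "ss",
# so all three spellings name the same identifier.
print(same_identifier("STRASSE", "stra\u00dfe"))   # True
print(same_identifier("STRA\u1e9eE", "strasse"))   # True
```

A system that has stored folded keys (file names, user names, symbol tables) relies on these results never changing for already-encoded characters.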
> Differences in how Unicode data is interpreted have opened security holes
> in systems, and while this isn't particularly likely with this change,
> it is possible, which is part of the reason case-folding is guaranteed
> to be stable. Such a change can confuse case-insensitive filesystems,
> or change the interpretation of code in case-insensitive filesystems.
> The automated default isn't going to change, and German is going to
> have to join Turkish as a language for which purely default
> case-conversion just doesn't work.
Again, it would help to mentally change from "default" to some other
term, like the "InvariantCulture" terminology used by .NET, for example.
By "default", if I start editing a document, I should not have to worry
about getting a deficient case mapping/case conversion implementation
just because I'm using the "wrong" language.
Likewise, by default, I should never get the locale-dependent case
conversion invoked when accessing file systems or domain names.
These are different "defaults".
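
A rough sketch of keeping the two "defaults" apart, in Python and under stated assumptions: `str.lower()` and `str.upper()` are the untailored (invariant) mappings, while `lower_for_language` is a hypothetical helper that handles only the Turkish/Azerbaijani dotted and dotless i. It is a simplification of the language-sensitive rules in SpecialCasing.txt, not a full implementation.

```python
def lower_invariant(text: str) -> str:
    """Untailored lowercasing: suitable for file names, domain names, identifiers."""
    return text.lower()


def lower_for_language(text: str, lang: str) -> str:
    """Hypothetical language-aware lowercasing for text a user is editing.

    Only the Turkish/Azerbaijani i rules are sketched here (assumption:
    no combining dot above follows a capital I in the input).
    """
    if lang in ("tr", "az"):
        # I (U+0049) -> dotless i (U+0131); I with dot above (U+0130) -> i
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()


print(lower_invariant("ISTANBUL"))           # istanbul
print(lower_for_language("ISTANBUL", "tr"))  # ıstanbul
print("stra\u00dfe".upper())                 # STRASSE (untailored German result)
```

The point of the split is that the caller picks the function by purpose (identifier matching versus document text), not by whatever locale the process happens to run in.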
A./