On 10/25/22 22:49, Siarhei Siamashka wrote:

> Unicode is significantly simpler than a set of various
> incompatible 8-bit encodings

Strongly agreed.

> I'm surely
> able to ignore the peculiarities of modern Turkish Unicode

The problem with Unicode is its main aim of allowing characters of multiple writing systems in the same text. When multiple writing systems are in play, conflicts and ambiguities will appear.

> and wait for
> the other people to come up with a solution for D language if they
> really care.

I solved my problem by writing an Alphabet hierarchy in the past. I don't like that code but it still works:


https://bitbucket.org/acehreli/ddili/src/4c0552fe8352dfe905c9734a57d84d36ce4ed476/src/alphabet.d#lines-50

It handles capitalization, ordering, etc. I use it when preparing the Index section of the Turkish edition of "Programming in D":

  http://ddili.org/ders/d/ix.html

One of the ambiguities is what came up on this thread: Should a word that starts with I (capital i) be listed under I (because it's Turkish) or under İ (because it's English)? So far, I am lucky because the only word that starts with I happens to be the English "IDE", so it goes under i (which appears as İ in the Turkish edition) and would make sense to a Turkish reader because a Turkish reader might (really?) accept it as the capital of ide.

It's confusing but it seems to work. :) It doesn't matter. Life is imperfect and things will somehow work in the end.

Ali

Reply via email to