Re: Replacing tango.text.Ascii.isearch

Ali Çehreli via Digitalmars-d-learn Tue, 25 Oct 2022 23:11:26 -0700

On 10/25/22 22:49, Siarhei Siamashka wrote:

> Unicode is significantly simpler than a set of various
> incompatible 8-bit encodings


Strongly agreed.

> I'm surely
> able to ignore the peculiarities of modern Turkish Unicode

The problem with Unicode is its main aim of allowing characters of multiple writing systems in the same text. When multiple writing systems are in play, conflicts and ambiguities will appear.


> and wait for
> the other people to come up with a solution for D language if they
> really care.

I solved my problem by writing an Alphabet hierarchy in the past. I don't like that code but it still works:



https://bitbucket.org/acehreli/ddili/src/4c0552fe8352dfe905c9734a57d84d36ce4ed476/src/alphabet.d#lines-50

It handles capitalization, ordering, etc. I use it when preparing the Index section of the Turkish edition of "Programming in D":


  http://ddili.org/ders/d/ix.html
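
To illustrate what "capitalization and ordering" means for a single alphabet, here is a minimal sketch. It is not the code from alphabet.d; the table names and the functions indexIn, toUpperTr, toLowerTr, rank and lessTr are all hypothetical and exist only for this example. It assumes the 29-letter Turkish alphabet and treats every other character as sorting after all letters.

```d
import std.algorithm : min, sort;

// Hypothetical tables for this sketch; alphabet.d is organized differently.
// The two strings are aligned: trUpper[i] is the capital of trLower[i].
immutable dstring trLower = "abcçdefgğhıijklmnoöprsştuüvyz"d;
immutable dstring trUpper = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ"d;

// Position of c in the table, or -1 when c is not a letter of the table.
ptrdiff_t indexIn(dstring table, dchar c)
{
    foreach (i, letter; table)
    {
        if (letter == c)
            return cast(ptrdiff_t) i;
    }
    return -1;
}

// Turkish capitalization: i -> İ and ı -> I. Other characters pass through.
dchar toUpperTr(dchar c)
{
    const i = indexIn(trLower, c);
    return i < 0 ? c : trUpper[i];
}

dchar toLowerTr(dchar c)
{
    const i = indexIn(trUpper, c);
    return i < 0 ? c : trLower[i];
}

// Both cases of a letter share one rank. Characters outside the
// alphabet all get the same rank, after the last letter (a simplification).
size_t rank(dchar c)
{
    const i = indexIn(trLower, toLowerTr(c));
    return i < 0 ? trLower.length : cast(size_t) i;
}

// Case-insensitive "less than" in Turkish alphabetical order.
bool lessTr(dstring a, dstring b)
{
    foreach (i; 0 .. min(a.length, b.length))
    {
        const ra = rank(a[i]);
        const rb = rank(b[i]);
        if (ra != rb)
            return ra < rb;
    }
    return a.length < b.length;
}

void main()
{
    assert(toUpperTr('i') == 'İ');
    assert(toUpperTr('ı') == 'I');

    // Under these rules a capital I is the dotless letter, so an
    // English word like "IDE" would rank with ı, not with i.
    assert(toLowerTr('I') == 'ı');

    auto words = ["ide"d, "ılık"d, "hata"d, "jeton"d, "İndeks"d];
    words.sort!lessTr;
    assert(words == ["hata"d, "ılık"d, "ide"d, "İndeks"d, "jeton"d]);
}
```

Keeping the two tables aligned by index lets one lookup serve both case conversion and ordering. A real implementation would also need a policy for words from another alphabet, which this sketch does not attempt.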

One of the ambiguities is what came up on this thread: Should a word that starts with I (capital i) be listed under I (because it's Turkish) or under İ (because it's English)? So far, I am lucky because the only word that starts with I happens to be the English "IDE", so it goes under i (which appears as İ in the Turkish edition) and would make sense to a Turkish reader because a Turkish reader might (really?) accept it as the capital of ide.

It's confusing but it seems to work. :) It doesn't matter. Life is imperfect and things will somehow work in the end.

Ali
