On 10/03/2012 01:37 PM, Dmitry Olshansky wrote:
> On 03-Oct-12 23:56, Ali Çehreli wrote:

> If we are talking about the order then this is the way to go:
> http://unicode.org/reports/tr10/

Thank you. I wasn't aware of that long read. :)

>> struct Order
>> {
>>     int base;
>>     int accent;
>>     int cased;
>> }
>>
>> (Of course opCmp() cannot return that type. :( )
>>
>> The idea is that only the application knows what type of comparison
>> makes sense.
>
> So instead the library does all of them? Ouch... I'm not sure I got the idea.

The idea was that there would be AlphabetChar and AlphabetString types that knew which writing system they belonged to: AlphabetChar!en, AlphabetChar!tr, etc.

For example, while ç is a distinct letter in the Turkish alphabet, it is an accented form of c in most other Latin-based alphabets. That affects the 'base' member above: ç and c get different 'base' values in Turkish but the same 'base' value elsewhere. On the other hand, â is an accented 'a' both in Turkish and in the other Latin-based alphabets, so â and a have the same 'base' value everywhere and differ only in 'accent'.
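
To make that concrete, here is a rough sketch of how such a three-level key could be compared. The table values and the names enOrder, trOrder, cmpInt and cmpFull are made up for this illustration and are not taken from the library; only a handful of lower-case letters are covered.

```d
// Three-level sort key, same shape as the Order struct quoted above.
struct Order
{
    int base;    // position of the base letter in the alphabet
    int accent;  // 0 = plain letter, 1 = accented variant
    int cased;   // 0 = lower case, 1 = upper case
}

// Hypothetical English table: ç is only an accented c.
Order enOrder(dchar c)
{
    switch (c)
    {
        case 'a': return Order(0, 0, 0);
        case 'â': return Order(0, 1, 0);
        case 'c': return Order(2, 0, 0);
        case 'ç': return Order(2, 1, 0);
        case 'd': return Order(3, 0, 0);
        default:  assert(0, "letter not covered by this sketch");
    }
}

// Hypothetical Turkish table: ç is its own letter between c and d.
Order trOrder(dchar c)
{
    switch (c)
    {
        case 'a': return Order(0, 0, 0);
        case 'â': return Order(0, 1, 0);
        case 'c': return Order(2, 0, 0);
        case 'ç': return Order(3, 0, 0);
        case 'd': return Order(4, 0, 0);
        default:  assert(0, "letter not covered by this sketch");
    }
}

int cmpInt(int a, int b)
{
    return a < b ? -1 : (a > b ? 1 : 0);
}

// Compare level by level: base first, then accent, then case.
int cmpFull(Order a, Order b)
{
    if (auto r = cmpInt(a.base, b.base)) return r;
    if (auto r = cmpInt(a.accent, b.accent)) return r;
    return cmpInt(a.cased, b.cased);
}

unittest
{
    // Same base letter in English, different base letters in Turkish.
    assert(enOrder('ç').base == enOrder('c').base);
    assert(trOrder('ç').base != trOrder('c').base);

    // â shares its base with a in both tables.
    assert(enOrder('â').base == enOrder('a').base);
    assert(trOrder('â').base == trOrder('a').base);
}
```

An application that only cares about base letters would compare the 'base' members alone; one that needs a strict ordering would use something like cmpFull. Compiling with dmd -unittest -main runs the checks.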

Collation takes the alphabet into account. Although AlphabetChar!en and AlphabetChar!tr are not directly comparable with each other, a comparison between them can be forced explicitly by naming the alphabet whose collation information should be used.
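
Roughly like the following sketch. This is not the library's actual code: the Alphabet enum, rank and compareAs are names I am inventing here, and only the primary (base letter) level is handled, for lower-case letters.

```d
enum Alphabet { en, tr }

// Short names so that AlphabetChar!en and AlphabetChar!tr can be written.
enum en = Alphabet.en;
enum tr = Alphabet.tr;

// Position of a letter's base letter within the given alphabet.
int rank(Alphabet alphabet)(dchar c)
{
    static if (alphabet == Alphabet.tr)
    {
        enum letters = "abcçdefgğhıijklmnoöprsştuüvyz"d;
    }
    else
    {
        enum letters = "abcdefghijklmnopqrstuvwxyz"d;
        if (c == 'ç') c = 'c';  // accented form of c: same base letter
    }

    foreach (i, l; letters)
    {
        if (l == c) return cast(int) i;
    }
    assert(0, "letter not covered by this sketch");
}

struct AlphabetChar(Alphabet alphabet)
{
    dchar value;

    // Only characters of the same alphabet are comparable by default.
    int opCmp(const AlphabetChar rhs) const
    {
        return rank!alphabet(value) - rank!alphabet(rhs.value);
    }
}

// Forced comparison: the caller names the collation to apply.
int compareAs(Alphabet collation, L, R)(L lhs, R rhs)
{
    return rank!collation(lhs.value) - rank!collation(rhs.value);
}

unittest
{
    auto e = AlphabetChar!en('ç');
    auto t = AlphabetChar!tr('c');

    // Mixing alphabets directly does not compile.
    static assert(!__traits(compiles, e < t));

    // Under English collation ç and c share a base letter ...
    assert(compareAs!en(e, t) == 0);
    // ... under Turkish collation ç comes after c.
    assert(compareAs!tr(e, t) > 0);
}
```

Making the alphabet a template parameter is what turns an accidental cross-alphabet comparison into a compile-time error, while compareAs keeps the explicit route open.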

So, that experimental library provides a number of alphabets, each with its own collation order. I see now that the library should have followed the Unicode Collation Algorithm described in the document you linked above. I will do some reading. :)

Ali
