[EMAIL PROTECTED] writes:
> Please also inform me about what will be the sorting for Bangla.
> Thanks and regards
> Mustafa Jabbar
Same response: you don't sort on code points. You sort using the UCA and the Default Unicode Collation Element Table (DUCET) published by Unicode with the UCA specification. The table may be compiled, for example, as a text file containing collation rules (see UCARules.txt in ICU) or as a complete conversion table from code points to collation weights.

For Bangla, the DUCET will certainly not be enough to match all your needs, and you'll probably need to tailor the collation order using expansion rules and swaps, with more collation levels than the DUCET shows (it documents only three levels before falling back to code point order: primary, secondary, and tertiary).

It will, however, be simpler than sorting Thai in logical (phonetic) order, which requires preprocessing with a dictionary to find grapheme clusters and syllables, unless you prefer to sort simply on the visual order.

I confess that I have not attempted to do any sorting of Thai data. If I had to, I would need to use a complete implementation such as the one found in ICU (but ICU is quite large for some projects).
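To make the difference between code point order and multi-level collation concrete, here is a minimal Python sketch. It is not the UCA and it does not use the DUCET: the function toy_sort_key is a hypothetical helper that only imitates the idea of three levels (base letter, accent, case) using the standard unicodedata module, and it uses Latin words rather than Bangla because no tailored Bangla table is assumed here. For real data you would want a full implementation such as ICU.

```python
import unicodedata

def toy_sort_key(text):
    """Build a three-level key loosely modelled on UCA levels.

    This is an illustration only: it is NOT the DUCET and NOT the UCA.
    """
    decomposed = unicodedata.normalize("NFD", text)
    primary, secondary, tertiary = [], [], []
    for ch in decomposed:
        if unicodedata.combining(ch):
            # Combining marks (accents) only count at the secondary level.
            secondary.append(ord(ch))
        else:
            primary.append(ch.lower())                 # base letter, case ignored
            secondary.append(0)                        # placeholder for "no accent"
            tertiary.append(1 if ch.isupper() else 0)  # lowercase before uppercase
    # Tuples compare level by level: primary first, then secondary, then tertiary.
    return (primary, secondary, tertiary)

words = ["Zebra", "apple", "\u00e9cole", "eagle", "Apple"]

by_codepoint = sorted(words)                   # raw code point order
by_levels = sorted(words, key=toy_sort_key)    # toy multi-level order

print(by_codepoint)
print(by_levels)
```

Code point order puts every uppercase word before every lowercase one and pushes the accented word to the end, while the level-based key groups words by base letters first and uses accents and case only to break ties. A tailoring for Bangla would follow the same principle with its own weights.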

