Mustafa Jabbar inquired:

> Please also inform me about what will be the sorting for Bangla.
> Thanks and regards
The Unicode Standard is *not* a sorting standard -- nor is any character encoding.

The reason it might seem to be, on occasion, is that there is a long history of people fiddling with the exact details of a character encoding, attempting to arrange the characters in an order such that dumb binary comparison algorithms will produce the "correct" results for pairs of strings in that particular encoding. The general consensus is, however, that it is impossible to accomplish meaningful linguistic sorting simply by tinkering with the character encoding tables. See Section 5.16 of the standard for a brief discussion of this issue.

For the related collation standard, see instead:

http://www.unicode.org/reports/tr10/

That is the Unicode Collation Algorithm (UCA). That standard explains how to accomplish culturally expected sorting and defines an algorithm and a default table to use for it.

That *still* is not the answer for how Bangla will be sorted, however. One has to make *use* of the Unicode Collation Algorithm and then tailor the table until it produces the results desired.

So the question which should be asked is: Has anyone produced a UCA-based collation for Bangla, and if so, what behavior does it have for sorting Bangla data?

See also the discussion of sorting issues for Indic languages in Cathy Wissink's technical note:

http://www.unicode.org/notes/tn1/

--Ken
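
To illustrate the point about binary comparison versus a weight table, here is a minimal Python sketch. It is *not* the Unicode Collation Algorithm and says nothing about Bangla; it uses ASCII letters only because their code point order is easy to check. The names WEIGHTS and tailored_key, and the two-level weights themselves, are assumptions made up for this example. A real collation would take its weights from the UCA default table plus a language-specific tailoring.

```python
# Toy illustration only: NOT the Unicode Collation Algorithm.
# WEIGHTS and tailored_key are hypothetical names invented for this sketch.
import string

words = ["zebra", "Apple", "apple", "Zebra"]

# Binary (code point) comparison: every uppercase ASCII letter has a lower
# code point than every lowercase one, so "Zebra" sorts before "apple".
binary_sorted = sorted(words)

# Hand-built weight table: (primary, case) per character.
# Primary weight = position in the alphabet; case weight = 0 lower, 1 upper.
WEIGHTS = {}
for index, letter in enumerate(string.ascii_lowercase):
    WEIGHTS[letter] = (index, 0)
    WEIGHTS[letter.upper()] = (index, 1)


def tailored_key(text):
    """Build a two-level sort key: all primary weights, then all case weights.

    Characters missing from the table fall back to a weight after the letters.
    """
    pairs = [WEIGHTS.get(ch, (26 + ord(ch), 0)) for ch in text]
    primary = tuple(p for p, _ in pairs)
    case = tuple(c for _, c in pairs)
    return (primary, case)


# Comparison is driven by the weight table, not by the encoding order.
tailored_sorted = sorted(words, key=tailored_key)

print(binary_sorted)    # ['Apple', 'Zebra', 'apple', 'zebra']
print(tailored_sorted)  # ['apple', 'Apple', 'zebra', 'Zebra']
```

Changing the entries in the table changes the sort order without touching the encoding, which is the sense in which a collation is tailored rather than built into the character codes.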

