To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=86516
------- Additional comments from [EMAIL PROTECTED] Fri Feb 29 04:58:23 +0000 2008 -------

KHONG wrote:
<<
I picked 2 strings from attached doc.
1. ཀ་ཀྲ་ལ། (knife)
2. ཀ་ཀྲ་ལ (noun knife: ཚོད་བསྲེ་ཚུ་བཏོག་ནི་གི་ཀ་ཀྲ་ལ་ “a knife to cut vegetables”)
Dzongkha_CorrectCollation-2.odt says 1 should be before 2, while Dzongkha_FaultyCollation.odt says 2 is before 1.
The hex values for these 2 strings are
1. \u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63\u0f0d
2. \u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63
From tailoring data, I could not see why 1 should be before 2.
>>

The difference between these two particular strings, and hence their relative ordering, is inconsequential. \u0f0d is a punctuation mark indicating the end of a phrase; some dictionaries put it after each head word and some omit it. So \u0f0d is probably an ignorable.

============

A better example....

OOo 2.3.1 collates the following Dzongkha strings like this:

དཀར་ཆག \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42
བཀྲ་ཤིས \u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66
རྐོང་ནད \u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51
སྐབས \u0F66\u0F90\u0F56\u0F66
སྐྱ་ཁབ \u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56
སྐུ \u0F66\u0F90\u0F74
བསྐྱར \u0F56\u0F66\u0F90\u0FB1\u0F62
མཁན་པོ \u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C
དགའ \u0F51\u0F42\u0F60
དག \u0F51\u0F42
ཀ་ཀྲ་ལ \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63
ཀྲེབ་ཀྲེམ \u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58
ཁ་བཀལ \u0F41\u0F0B\u0F56\u0F40\u0F63
བ \u0F56
མ \u0F58
ར \u0F62

===============================================

The correct ordering of these strings should be:

ཀ་ཀྲ་ལ \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63
ཀྲེབ་ཀྲེམ \u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58
དཀར་ཆག \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42
བཀྲ་ཤིས \u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66
རྐོང་ནད \u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51
རྐྱང་ལོར \u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62
སྐབས \u0F66\u0F90\u0F56\u0F66
སྐུ \u0F66\u0F90\u0F74
སྐྱ་ཁབ \u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56
བསྐྱར \u0F56\u0F66\u0F90\u0FB1\u0F62
ཁ་བཀལ \u0F41\u0F0B\u0F56\u0F40\u0F63
མཁན་པོ \u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C
དགའ \u0F51\u0F42\u0F60
དག \u0F51\u0F42
བ \u0F56
མ \u0F58
ར \u0F62

=====================

Which is very different.

Please see: Dzongkha Collation Chart <http://developer.mimer.com/charts/dzongkha.htm>

This illustrates a tailoring for Dzongkha (& Tibetan) which produces correct results (the same ordering as Dzongkha & Tibetan dictionaries).

You will see that Dzongkha & Tibetan collation is complicated by the presence, in many words, of up to two prefix characters before the root letter, which is the primary sort key.

Words with the same root letter (e.g. "ka" \u0F40 or \u0f90) having no prefix, e.g.
ཀ་ཀྲ་ལ \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63
should collate *before* words with the same root letter but with a prefix, e.g.
དཀར་ཆག \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42 (prefix "da" \u0F51)
or
རྐྱང་ལོར \u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62 (prefix "ra" \u0F62)

========================================================================

To understand Dzongkha & Tibetan ordering it may also be helpful to look at the two charts:
<http://chris.fynn.googlepages.com/tibetanlettercombinations>
<http://chris.fynn.googlepages.com/tibetan_prefixes>
which illustrate the standard letter combinations and prefixes occurring in normal Dzongkha & Tibetan words.

Many thanks for looking into this issue.
Hope the above clarifies just what the issue is.

regards
- Chris Fynn (National Library of Bhutan)

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
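
For readers who want to experiment with the rule described in the message (root letter as the primary sort key, unprefixed words before prefixed ones, \u0f0d ignored), here is a rough Python sketch. It is not the OOo or Mimer tailoring, and every name in it (sort_key, STACK, PREFIXES, HEADS, MEDIALS) is made up for illustration. It rests on several assumptions: only the first syllable is analysed, the root letter is guessed with a simple heuristic that will misjudge some real words, code point order is taken as alphabetical order for the native letters, and the ranking of superscribed and subjoined letters is inferred only from the sample list in the message.

```python
import re

TSHEG = "\u0f0b"   # syllable separator
SHAD = "\u0f0d"    # phrase-final mark, treated here as ignorable
PREFIXES = set("\u0f42\u0f51\u0f56\u0f58\u0f60")  # ga, da, ba, ma, 'a
HEADS = set("\u0f62\u0f63\u0f66")                 # superscribed ra, la, sa
MEDIALS = set("\u0fad\u0fb1\u0fb2\u0fb3")         # subjoined wa, ya, ra, la

# One written "stack": base consonant, subjoined consonants, vowel signs.
STACK = re.compile("([\u0f40-\u0f6c])([\u0f90-\u0fbc]*)([\u0f71-\u0f7d]*)")


def sort_key(word):
    """Heuristic sort key: root letter first, then prefix/head information."""
    word = word.replace(SHAD, "")
    stacks = STACK.findall(word.split(TSHEG)[0])
    if not stacks:
        return ("", 0, "", "", "", "", word)
    # Assumption: the first stack carrying a subjoined letter or a vowel sign
    # holds the root; otherwise guess from the number of plain letters.
    marked = [i for i, (_, subs, vowels) in enumerate(stacks) if subs or vowels]
    if marked:
        r = marked[0]
    elif len(stacks) >= 3 and stacks[0][0] in PREFIXES:
        r = 1
    else:
        r = 0
    base, subs, vowels = stacks[r]
    head = ""
    if base in HEADS and subs and subs[0] not in MEDIALS:
        # Superscribed letter: the real root is the first subjoined consonant
        # (subjoined forms sit 0x50 above their base forms in the block).
        head, base, subs = base, chr(ord(subs[0]) - 0x50), subs[1:]
    prefix = stacks[r - 1][0] if r > 0 else ""
    # 0 = bare root, 1 = prefix only, 2 = head only, 3 = prefix and head
    group = (1 if prefix else 0) + (2 if head else 0)
    return (base, group, prefix, head, subs, vowels, word)


# The strings from the message, in the order OOo 2.3.1 produced.
words = [
    "\u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42",
    "\u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66",
    "\u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51",
    "\u0F66\u0F90\u0F56\u0F66",
    "\u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56",
    "\u0F66\u0F90\u0F74",
    "\u0F56\u0F66\u0F90\u0FB1\u0F62",
    "\u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C",
    "\u0F51\u0F42\u0F60",
    "\u0F51\u0F42",
    "\u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63",
    "\u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58",
    "\u0F41\u0F0B\u0F56\u0F40\u0F63",
    "\u0F56",
    "\u0F58",
    "\u0F62",
]
ordered = sorted(words, key=sort_key)
```

Under these assumptions, `ordered` should follow the "correct ordering" listed in the message for the strings it contains. A real fix would still belong in the collation tailoring data, since a first-syllable heuristic like this cannot resolve every ambiguous letter sequence.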
