To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=86516





------- Additional comments from [EMAIL PROTECTED] Fri Feb 29 04:58:23 +0000 2008 -------
KHONG wrote:

<< I picked 2 strings from
attached doc.

1. ཀ་ཀྲ་ལ། (knife)
2. ཀ་ཀྲ་ལ (noun knife: ཚོད་བསྲེ་ཚུ་བཏོག་ནི་གི་ཀ་ཀྲ་ལ་ “a knife to cut 
vegetables”)

Dzongkha_CorrectCollation-2.odt says 1 should be before 2, while
Dzongkha_FaultyCollation.odt says 2 is before 1. 

The hex values for these 2 strings are

1. \u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63\u0f0d
2. \u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63

From the tailoring data, I could not see why 1 should be before 2.
>>

The difference between these two particular strings, and hence their relative
ordering, is inconsequential: \u0f0d is a punctuation mark indicating the end of a
phrase. Some dictionaries put it after each headword and some omit it.

So \u0f0d should probably be treated as ignorable.
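
As a rough illustration of what "ignorable" means here, the sketch below drops \u0f0d before comparing the two strings quoted above. The helper name strip_ignorable is only an assumption for this example; a real collator would normally give the character no collation weight rather than delete it from the text.

```python
# Minimal sketch: treat U+0F0D (TIBETAN MARK SHAD) as ignorable by removing it
# before comparison. `strip_ignorable` is a hypothetical helper, not an OOo or
# ICU function.
SHAD = "\u0F0D"


def strip_ignorable(text, ignorables=frozenset({SHAD})):
    """Return text with every ignorable character removed."""
    return "".join(ch for ch in text if ch not in ignorables)


# The two strings from the quoted comment (1 ends with the shad, 2 does not).
with_shad = "\u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63\u0f0d"
without_shad = "\u0f40\u0f0b\u0f40\u0fb2\u0f0b\u0f63"

# Once the shad is ignored, the two strings compare as equal.
same = strip_ignorable(with_shad) == strip_ignorable(without_shad)
print(same)
```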

============

A better example....

OOo 2.3.1  collates the following Dzongkha strings like this:

དཀར་ཆག   \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42
བཀྲ་ཤིས    \u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66
རྐོང་ནད    \u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51
སྐབས    \u0F66\u0F90\u0F56\u0F66
སྐྱ་ཁབ    \u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56
སྐུ      \u0F66\u0F90\u0F74
བསྐྱར    \u0F56\u0F66\u0F90\u0FB1\u0F62
མཁན་པོ   \u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C
དགའ    \u0F51\u0F42\u0F60
དག     \u0F51\u0F42
ཀ་ཀྲ་ལ    \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63
ཀྲེབ་ཀྲེམ    \u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58
ཁ་བཀལ   \u0F41\u0F0B\u0F56\u0F40\u0F63
བ      \u0F56
མ      \u0F58
ར      \u0F62  

===============================================

The correct ordering of these strings should be: 

ཀ་ཀྲ་ལ    \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63
ཀྲེབ་ཀྲེམ   \u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58
དཀར་ཆག   \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42
བཀྲ་ཤིས    \u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66
རྐོང་ནད    \u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51
རྐྱང་ལོར    \u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62
སྐབས     \u0F66\u0F90\u0F56\u0F66
སྐུ        \u0F66\u0F90\u0F74
སྐྱ་ཁབ    \u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56
བསྐྱར     \u0F56\u0F66\u0F90\u0FB1\u0F62
ཁ་བཀལ   \u0F41\u0F0B\u0F56\u0F40\u0F63
མཁན་པོ   \u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C
དགའ   \u0F51\u0F42\u0F60
དག    \u0F51\u0F42
བ     \u0F56
མ      \u0F58
ར      \u0F62 
=====================
This is very different from the order OOo produces.
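
One way to check a collator against the expected list is sketched below. It is only a test-harness idea: EXPECTED is copied from the correct ordering above, first_mismatch is a hypothetical helper, and plain code-point sorting is used purely as a stand-in for whatever collator is under test (it is not what OOo 2.3.1 does).

```python
# Expected dictionary order, copied from the "correct ordering" list above.
EXPECTED = [
    "\u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63",
    "\u0F40\u0FB2\u0F7A\u0F56\u0F0B\u0F40\u0FB2\u0F7A\u0F58",
    "\u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42",
    "\u0F56\u0F40\u0FB2\u0F0B\u0F64\u0F72\u0F66",
    "\u0F62\u0F90\u0F7C\u0F44\u0F0B\u0F53\u0F51",
    "\u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62",
    "\u0F66\u0F90\u0F56\u0F66",
    "\u0F66\u0F90\u0F74",
    "\u0F66\u0F90\u0FB1\u0F0B\u0F41\u0F56",
    "\u0F56\u0F66\u0F90\u0FB1\u0F62",
    "\u0F41\u0F0B\u0F56\u0F40\u0F63",
    "\u0F58\u0F41\u0F53\u0F0B\u0F54\u0F7C",
    "\u0F51\u0F42\u0F60",
    "\u0F51\u0F42",
    "\u0F56",
    "\u0F58",
    "\u0F62",
]


def first_mismatch(actual, expected):
    """Return the index where two orderings first differ, or None if equal."""
    for i, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return i
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    return None


# Stand-in for a collator under test: plain code-point order.
candidate = sorted(EXPECTED)
print(first_mismatch(candidate, EXPECTED))
```

Replacing the sorted() call with the output of the collator being tested gives a quick regression check for a tailoring.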


Please see: 
Dzongkha Collation Chart
<http://developer.mimer.com/charts/dzongkha.htm>
This illustrates a tailoring for Dzongkha (& Tibetan) which produces correct
results (the same order as Dzongkha & Tibetan dictionaries).

You will see that Dzongkha & Tibetan collation is complicated by the presence, in
many words, of up to two prefix characters before the root letter, which is the
primary sort key.

Words with a given root letter (e.g. "ka", \u0F40 or its subjoined form \u0f90) and no prefix,
e.g.  ཀ་ཀྲ་ལ    \u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63

should collate *before* words with the same root letter but with a prefix

 e.g དཀར་ཆག   \u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42 (prefix "da" \u0F51)

or
རྐྱང་ལོར    \u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62  (prefix "ra" \u0F62)
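
The rule above can be sketched as a two-part sort key: root letter first, then whether a prefix is present. In this sketch the root and prefix of each word are annotated by hand from the three examples just given; the Entry type and sort_key function are assumptions for illustration, and finding the root letter automatically (which a real tailoring has to do) is not attempted.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    text: str               # the word as stored
    root: str               # root letter, annotated by hand (base form)
    prefix: Optional[str]   # letter before the root, or None if there is none


# The three example words above, deliberately listed in the wrong order.
# The subjoined root \u0F90 is recorded as its base form \u0F40.
ENTRIES = [
    Entry("\u0F51\u0F40\u0F62\u0F0B\u0F46\u0F42", "\u0F40", "\u0F51"),
    Entry("\u0F62\u0F90\u0FB1\u0F44\u0F0B\u0F63\u0F7C\u0F62", "\u0F40", "\u0F62"),
    Entry("\u0F40\u0F0B\u0F40\u0FB2\u0F0B\u0F63", "\u0F40", None),
]


def sort_key(entry):
    """Primary key: root letter. Secondary key: no prefix sorts before a prefix."""
    return (entry.root, entry.prefix is not None)


# sorted() is stable, so words that tie on this key keep their input order.
ordered = sorted(ENTRIES, key=sort_key)
for entry in ordered:
    print(entry.text)
```

This only shows the "no prefix before prefix" step; the relative order of different prefixes and of the letters after the root is left to the full tailoring.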

========================================================================

To understand Dzongkha & Tibetan ordering it may also be helpful to look at the
two charts:
<http://chris.fynn.googlepages.com/tibetanlettercombinations>
<http://chris.fynn.googlepages.com/tibetan_prefixes>

These charts illustrate the standard letter combinations and prefixes occurring in
normal Dzongkha & Tibetan words.

Many thanks for looking into this issue.

I hope the above clarifies just what the issue is.

regards

- Chris Fynn

(National Library of Bhutan)

 


---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]