[Bug 168225] Cannot sort in Tibetan / Dzongkha alphabetical order

bugzilla-daemon Mon, 01 Sep 2025 07:44:17 -0700

https://bugs.documentfoundation.org/show_bug.cgi?id=168225


--- Comment #3 from Ming Hua <[email protected]> ---
(In reply to Elie Roux from comment #2)
> I can answer any question on the order if needed.
I don't think we have any Tibetan script experts here on Bugzilla.  So I'll ask
a few questions that may seem obvious to you but are actually hard for me as a
non-user, and I hope your answers will help other QA people and developers as
well.

You listed three collation rules in comment #0:

> CLDR has collation rules for Tibetan and Dzongkha:
> https://github.com/unicode-org/cldr/blob/main/common/collation/ (bo.xml and
> dz.xml)
> 
> LibreOffice has collation rules for Dzongkha:
> https://github.com/LibreOffice/core/blob/
> 4efd03d69ac7f6ae463aa56cea6f0e80f289f6e3/i18npool/source/collator/data/
> dz_charset.txt
> 
> The GLibC also has implemented the rules:
> https://sourceware.org/bugzilla/show_bug.cgi?id=21547

Are the character orders in them correct from your perspective? Do they all
give the same collation order?

Regarding the three strings in your example (I added their Unicode codepoints):

ཀ་ (U+0F40 U+0F0B)
སྐ་ (U+0F66 U+0F90 U+0F0B)
ས་ (U+0F66 U+0F0B)

I see that both ཀ and ས are in the 30-consonant list for Tibetan (I am Chinese,
so it's easy for me to search for information about Tibetan; if Dzongkha is
somehow different, let me know), but སྐ is not on the list and involves the
U+0F90 (TIBETAN SUBJOINED LETTER KA) character.  Is there some general sorting
rule for strings with subjoined letters?
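
To make the question concrete, here is a minimal Python sketch that prints the
code points of the three example strings and sorts them by raw code point. It
only shows what an untailored code point comparison does; it is not how
LibreOffice, CLDR, or glibc collate. The names `samples`, `describe`, and
`by_codepoint` are made up for this illustration.

```python
import unicodedata

# The three example strings, written as escapes so the code points are explicit.
samples = [
    "\u0f40\u0f0b",        # ཀ་
    "\u0f66\u0f90\u0f0b",  # སྐ་
    "\u0f66\u0f0b",        # ས་
]


def describe(s):
    """Return 'U+XXXX NAME' for every character in s."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


for s in samples:
    print(s, describe(s))

# Python's default string comparison goes code point by code point.
by_codepoint = sorted(samples)
print(by_codepoint)
```

Because U+0F0B (the tsheg) is lower than U+0F90, a raw code point sort puts ས་
before སྐ་, which is not the order in which the three strings are listed above.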

-- 
You are receiving this mail because:
You are the assignee for the bug.
