Hi there,

Could someone familiar with the relevant part of the source code look at how sorting works with UTF8 and the ICU collations, and fix it if a bug is confirmed?

The symptoms, along with what I tested and confirmed, are described in my comments on CORE-5940. The following is a summary:

1. UNICODE_CI and UNICODE_CI_AI incorrectly behave as if COLLATE UNICODE had been requested, even when ordering by multiple columns.
2. With COLLATE UNICODE, the results were correct in every combination.
3. If the ordering columns match a unique index, the sort is also correct in every combination for UNICODE_CI and UNICODE_CI_AI.
4. Firebird uses ICU sort keys for COLLATE UNICODE, UNICODE_CI, and UNICODE_CI_AI.
5. The sort key is composed of three buckets: Body, Case, Accent.
6. The sort key generated by the ICU library contains only the body part for UNICODE_CI_AI, and the body and the case for UNICODE_CI.
7. With COLLATE UNICODE_CI and UNICODE_CI_AI, Firebird apparently saves and compares all three buckets, which is wrong, when the ordering columns do not match a unique index.

Therefore the problem seems to lie in one of the following, A or B:

A) When sorting, the compare function must select which buckets to compare according to the collation.
B) If the saved sort key is supposed to be compared in full regardless of the collation, then the collation must be taken into account when the sort key is created.
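To make A and B concrete, here is a toy model in Python. It is only a sketch: it uses neither ICU nor Firebird code, and the names full_key, truncated_key, compare and LEVELS are hypothetical. The key layout (base letters, then accents, then case) is a simplification assumed for the model, not ICU's real byte format.

```python
import unicodedata
from functools import cmp_to_key

# Assumed number of key levels that matter for each collation (toy model only).
LEVELS = {"UNICODE": 3, "UNICODE_CI": 2, "UNICODE_CI_AI": 1}

def full_key(text):
    """Toy three-level sort key: (base letters, accents, case)."""
    decomposed = unicodedata.normalize("NFD", text)
    base = [c for c in decomposed if not unicodedata.combining(c)]
    primary = "".join(c.lower() for c in base)      # base letters only
    secondary = decomposed.lower()                  # base letters plus accents
    tertiary = tuple(c.isupper() for c in base)     # case flags, lowercase first
    return (primary, secondary, tertiary)

def truncated_key(text, collation):
    """Option B: build the key with only the levels the collation needs."""
    return full_key(text)[:LEVELS[collation]]

def compare(key_a, key_b, collation):
    """Option A: keep full keys, but compare only the relevant levels."""
    n = LEVELS[collation]
    a, b = key_a[:n], key_b[:n]
    return (a > b) - (a < b)

def row_compare(collation):
    """Order rows by column 1 under the collation, then by column 2."""
    def cmp(r1, r2):
        c = compare(full_key(r1[0]), full_key(r2[0]), collation)
        return c if c != 0 else (r1[1] > r2[1]) - (r1[1] < r2[1])
    return cmp

rows = [("abc", 2), ("ABC", 1), ("Abc", 3)]

# Symptom: full keys compared for a CI collation, so case still decides.
buggy = sorted(rows, key=lambda r: (full_key(r[0]), r[1]))
# buggy == [("abc", 2), ("Abc", 3), ("ABC", 1)]

# Option A: level-aware compare function on full keys.
fixed_a = sorted(rows, key=cmp_to_key(row_compare("UNICODE_CI")))

# Option B: level-aware key creation, plain full comparison.
fixed_b = sorted(rows, key=lambda r: (truncated_key(r[0], "UNICODE_CI"), r[1]))
# fixed_a == fixed_b == [("ABC", 1), ("abc", 2), ("Abc", 3)]
```

In this model both options give the same order: the three spellings tie on the first column, so the second column decides. The difference is only where the collation is applied, in the compare function (A) or at key creation (B).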

Thanks in advance.
Regards,
Hiro


