I really don't understand why you emphasize so much on case-/accent-insensitivity, and say UNICODE is ok, as we can easily demonstrate the root of the problem of multi-level and multi-segment with it.
Yes, there is inconsistencies with insensitive with/without unique, but root of the problem (non-interleaved keys) is better demonstrated with a case-/accent-sensitive collation. -- Natural order we expect. Case is less important than base letter. SQL> create table t1 (c1 varchar(2) character set utf8 collate unicode); SQL> insert into t1 values ('a1'); SQL> insert into t1 values ('a2'); SQL> insert into t1 values ('A1'); SQL> insert into t1 values ('A2'); SQL> SQL> SQL> select * from t1 order by c1; C1 ======== a1 A1 a2 A2 -- But when string is separated in multiple columns, case become more important than we naturally expect. SQL> create table t2 (c1 varchar(1) character set utf8 collate unicode, n1 integer); SQL> insert into t2 values ('a', 1); SQL> insert into t2 values ('a', 2); SQL> insert into t2 values ('A', 1); SQL> insert into t2 values ('A', 2); SQL> SQL> select * from t2 order by c1, n1; C1 N1 ====== ============ a 1 a 2 A 1 A 2 Adriano On 18/10/2018 14:37, nonomura wrote: > Hi there, > > I would like to ask someone who can check and correct if bug was found > to see the relating part of source code regarding sorting on UTF8 + > ICU collations. > > The symptom and what I tested and confirmed have commented in > CORE-5940. The following is the summary: > > 1. UNICODE_CI and UNICODE_CI_AI are falsely working as if requested by > collate UNICODE even in ordering by multiple columns. > 2. As for collate UNICODE, the results were fine in any combination. > 3. If the ordering combination was of unique index, the sort is > correctly done in any combination also for UNICODE_CI and UNICODE_CI_AI. > 4. Firebird uses ICU's sort key for collate UNICODE, UNICODE_CI, > UNICODE_CI_AI. > 5. The sort key is composed of three buckets: Body, Case, Accent. > 6. The sort key generated by ICU library for UNICODE_CI_AI is only the > body part and for UNICODE_CI the body and the case. > 7. In collate UNICODE_CI and CI_AI, Firebird saves and obviously > compares the three buckets falsely if the ordering combination is not > of unique index. > > Therefore the problem seems to reside in either following A or B: > > A) In sorting, "compare function" must select the buckets to compare > according to its collate property. > B) If saved sort key is supposed to be fully compared in the function > regardless of the collate property, then collate property must be > considered in creating the sort key. > > Thanks in advance. > Regards, > Hiro > > > > Firebird-Devel mailing list, web interface at > https://lists.sourceforge.net/lists/listinfo/firebird-devel Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel