I really don't understand why you emphasize so much on
case-/accent-insensitivity, and say UNICODE is ok, as we can easily
demonstrate the root of the problem of multi-level and multi-segment
with it.

Yes, there is inconsistencies with insensitive with/without unique, but
root of the problem (non-interleaved keys) is better demonstrated with a
case-/accent-sensitive collation.


-- Natural order we expect. Case is less important than base letter.

SQL> create table t1 (c1 varchar(2) character set utf8 collate unicode);
SQL> insert into t1 values ('a1');
SQL> insert into t1 values ('a2');
SQL> insert into t1 values ('A1');
SQL> insert into t1 values ('A2');
SQL>
SQL>
SQL> select * from t1 order by c1;

C1
========
a1
A1
a2
A2


-- But when string is separated in multiple columns, case become more
important than we naturally expect.

SQL> create table t2 (c1 varchar(1) character set utf8 collate unicode,
n1 integer);
SQL> insert into t2 values ('a', 1);
SQL> insert into t2 values ('a', 2);
SQL> insert into t2 values ('A', 1);
SQL> insert into t2 values ('A', 2);
SQL>
SQL> select * from t2 order by c1, n1;

C1               N1
====== ============
a                 1
a                 2
A                 1
A                 2


Adriano


On 18/10/2018 14:37, nonomura wrote:
> Hi there,
>
> I would like to ask someone who can check and correct if bug was found
> to see the relating part of source code regarding sorting on UTF8 +
> ICU collations.
>
> The symptom and what I tested and confirmed have commented in
> CORE-5940. The following is the summary:
>
> 1. UNICODE_CI and UNICODE_CI_AI are falsely working as if requested by
> collate UNICODE even in ordering by multiple columns.
> 2. As for collate UNICODE, the results were fine in any combination.
> 3. If the ordering combination was of unique index, the sort is
> correctly done in any combination also for UNICODE_CI and UNICODE_CI_AI.
> 4. Firebird uses ICU's sort key for collate UNICODE, UNICODE_CI,
> UNICODE_CI_AI.
> 5. The sort key is composed of three buckets: Body, Case, Accent.
> 6. The sort key generated by ICU library for UNICODE_CI_AI is only the
> body part and for UNICODE_CI the body and the case.
> 7. In collate UNICODE_CI and CI_AI, Firebird saves and obviously
> compares the three buckets falsely if the ordering combination is not
> of unique index.
>
> Therefore the problem seems to reside in either following A or B:
>
> A) In sorting, "compare function" must select the buckets to compare
> according to its collate property.
> B) If saved sort key is supposed to be fully compared in the function
> regardless of the collate property, then collate property must be
> considered in creating the sort key.
>
> Thanks in advance.
> Regards,
> Hiro
>
>
>
> Firebird-Devel mailing list, web interface at
> https://lists.sourceforge.net/lists/listinfo/firebird-devel



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to