Hi,
We encountered a strange behavior in ordering a column using a created collation. Here are the experiments details: Experiment 1:- SQL File : PG_Exp_1.sql Actual Output : PG_Exp_1.out Created COLLATION : CREATE COLLATION test_coll ( provider = icu, locale = 'ja-u-kr-latn-digit'); 'SELECT' Queries : SELECT * FROM test_table ORDER BY value1 COLLATE test_coll, value2 COLLATE test_coll; SELECT * FROM test_table ORDER BY value2 COLLATE test_coll, value1 COLLATE test_coll; Expectation : All alphabets should come before all digits. Seen Behavior : Column 1 in the 'ORDER BY' i.e value1 of the first 'SELECT' and Column 1 in the 'ORDER BY' i.e value2 of the second 'SELECT' is giving the correct order. But Column 2 in the 'ORDER BY' i.e value2 in the first 'SELECT' and Column 2 in the 'ORDER BY' i.e value1 in the second 'SELECT' is NOT giving the correct order. Experiment 2:- SQL File : PG_Exp_2.sql Actual Output : PG_Exp_2.out Created 'COLLATION' : CREATE COLLATION test_coll ( provider = icu, locale = 'ja-u-kr-digit-latn'); 'SELECT' Queries : Same as 'Experiment 1'. Expectation : All digits should come before all alphabets. Seen Behavior : Matching with expectation. Column 1 in the 'ORDER BY' i.e value1 of the first 'SELECT' and Column 1 in the 'ORDER BY' i.e value2 of the second 'SELECT' is giving the correct order. And Column 2 in the 'ORDER BY' i.e value2 in the first 'SELECT' and Column 2 in the 'ORDER BY' i.e value1 in the second 'SELECT' is giving the correct order. We did debug 'Experiment 1' and we find that:- Whatever is the Column 1 in 'ORDER BY' gets correctly ordered, because it uses abbreviated sort optimization due to which its data datum gets converted to abbreviated datum using "varstr_abbrev_convert()" function, and then the comparator function selected is "ssup->comparator = ssup_datum_unsigned_cmp()" for sorting operation. But in case of column 2 in 'ORDER BY' (which is showing incorrect result for 'Experiment 1') does not use abbreviated sort optimization and here comparator function selected is "ssup->comparator = varlenafastcmp_locale" --> "strncoll_icu_utf8()", which appears, uses the third party ICU library function for comparison and does not work as expected. Need help in confirming why 'Experiment 1' is behaving as mentioned above - 1. If our expectation of 'Experiment 1' is wrong? 2. Bug in abbreviated sort optimization? 3. Bug in third party comparator function "strncoll_icu_utf8()"? 4. Any other aspects which we are missing? 5. Or everything appears good? PFA, the experiment files. Thanks & Regards, Nishant Sharma. EnterpriseDB, Pune, India.
PG_Exp_1.sql
Description: Binary data
PG_Exp_2.sql
Description: Binary data
PG_Exp_1.out
Description: Binary data
PG_Exp_2.out
Description: Binary data
