[
https://issues.apache.org/jira/browse/DERBY-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Knut Anders Hatlen updated DERBY-3975:
--------------------------------------
Attachment: derby-3975-1a.stat
derby-3975-1a.diff
The attached patch attempts to fix the problem by implementing a hashCode()
method in CollatorSQLChar, CollatorSQLVarchar, CollatorSQLLongvarchar and
CollatorSQLClob based on CollationKey.hashCode(). It also extends
CollationTest.compareAgrave() with a test case for SELECT DISTINCT, and makes
it test both CHAR and VARCHAR (previously it only tested VARCHAR).
CollationTest fails without the fix and passes with the fix. The test is based
on the fact that in the French locale, À (Unicode code point 00C0) is the same
as À ('A' + Unicode code point 0300), whereas they are different in UCS_BASIC.
I will start the regression tests now.
> SELECT DISTINCT may return duplicates with territory-based collation
> --------------------------------------------------------------------
>
> Key: DERBY-3975
> URL: https://issues.apache.org/jira/browse/DERBY-3975
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.4.2.0
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Attachments: AisBCollatorProvider.java, derby-3975-1a.diff,
> derby-3975-1a.stat
>
>
> I defined my own locale (en_US_aisb) where the collation rules said that a=b.
> When I tried queries with SELECT DISTINCT, they didn't always eliminate all
> duplicates. Here's an example:
> ij> connect
> 'jdbc:derby:db;create=true;territory=en_US_aisb;collation=TERRITORY_BASED';
> ij> create table t (x varchar(10));
> 0 rows inserted/updated/deleted
> ij> insert into t values 'a','b','abba','baab','ABBA';
> 5 rows inserted/updated/deleted
> ij> select distinct * from t;
> X
> ----------
> ABBA
> b
> a
> abba
> 4 rows selected
> ij> select distinct * from t order by x;
> X
> ----------
> a
> abba
> ABBA
> 3 rows selected
> The first query did eliminate the duplicate "abba"/"baab", but it did not
> eliminate the duplicate "a"/"b". When an ORDER BY clause was added (the
> second query), all the duplicates were eliminated.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.