[ 
https://issues.apache.org/jira/browse/DERBY-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482363#comment-13482363
 ] 

Knut Anders Hatlen edited comment on DERBY-5959 at 10/23/12 3:46 PM:
---------------------------------------------------------------------

Java 8 fixes a bug in the Thai locale ( 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6755060 ), and that fix causes a 
similar issue on upgrade with territory-based collation.

For example, create a database in Java 7:

connect 'jdbc:derby:thaidb;territory=th;collation=TERRITORY_BASED;create=true';
create table t(x int, c char(1) unique not null);
insert into t values (1, '๎'), (2, '์');

(The character in row 1 is \u0e4e, and the one in row 2 is \u0e4c.)

Update the database in Java 8, which has a different ordering:

connect 'jdbc:derby:thaidb';
insert into t values (3, '๎');

(The character is \u0e4e.)

The table contents now are:

ij> select * from t;
X          |C
-------------
1          |๎
2          |์
3          |๎

3 rows selected

The value of C is identical in row 1 and row 3, even though there is a UNIQUE 
constraint on the column.
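
To see whether a given JVM orders the two characters differently, the collator can be queried directly, outside Derby. The following is only a sketch: the class name ThaiCollationCheck and the helper compare are made up for illustration, and it assumes that the default collator for the "th" locale is representative of what Derby obtains for territory=th. Run it on both JVMs and compare the printed signs.

```java
import java.text.Collator;
import java.util.Locale;

public class ThaiCollationCheck {
    // Hypothetical helper: sign (-1, 0, 1) of comparing a and b with the
    // JVM's default collator for the given locale.
    static int compare(Locale locale, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        return Integer.signum(collator.compare(a, b));
    }

    public static void main(String[] args) {
        Locale thai = Locale.forLanguageTag("th");
        String rowOne = "\u0e4e"; // character inserted in rows 1 and 3
        String rowTwo = "\u0e4c"; // character inserted in row 2

        System.out.println("java.version = " + System.getProperty("java.version"));
        // If this sign differs between two JVMs, an index built on one of
        // them is ordered in a way the other one does not expect.
        System.out.println("compare(\\u0e4e, \\u0e4c) = " + compare(thai, rowOne, rowTwo));
        System.out.println("compare(\\u0e4e, \\u0e4e) = " + compare(thai, rowOne, rowOne));
    }
}
```

The second comparison should print 0 on any JVM. It is the relative order of the two distinct characters that is expected to differ.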

[Comment edited: Added spaces around URL to prevent JIRA from garbling it.]
                
> Territory-based collation is not robust against changes in the collation rules
> ------------------------------------------------------------------------------
>
>                 Key: DERBY-5959
>                 URL: https://issues.apache.org/jira/browse/DERBY-5959
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.10.0.0
>            Reporter: Knut Anders Hatlen
>
> When accessing a database with territory-based collation, Derby will use the 
> collation rules of the collator returned by 
> Collator.getInstance(databaseLocale). However, there is no guarantee that 
> those rules are consistent across different JVM vendors and versions. This 
> means that the ordering could vary, and inconsistencies could sneak into the 
> indexes.
> One example is that Oracle's JDK changed the collation rules for Turkish 
> between Java 5 and Java 6, so if you run the following script
>
> connect 'jdbc:derby:memory:db;territory=tr_TR;collation=TERRITORY_BASED;create=true';
> create table t(c char(2));
> insert into t values 'ıa', 'Ia', 'ia', 'İa', 'ıb', 'Ib', 'ib', 'İb';
> select * from t order by c;
>
> you'll get different results on Java 5 and on Java 6 and later.
> Java 5 will order the results like this:
> ij> select * from t order by c;
> C   
> ----
> ıa  
> Ia  
> ia  
> İa  
> ıb  
> Ib  
> ib  
> İb  
> 8 rows selected
> Java 6 and later order them like this:
> ij> select * from t order by c;
> C   
> ----
> ıa  
> Ia  
> ıb  
> Ib  
> ia  
> İa  
> ib  
> İb  
> 8 rows selected
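
The ordering in the quoted description can also be checked without Derby by sorting the same eight strings with the JVM's collator. This is a rough sketch, not Derby's actual code path: the class name TurkishCollationOrder and the helper sorted are invented for illustration, and it assumes that Collator.getInstance for tr-TR yields the rules Derby would use for territory=tr_TR.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class TurkishCollationOrder {
    // Hypothetical helper: returns a copy of the values, sorted with the
    // JVM's default collator for the given locale.
    static List<String> sorted(Locale locale, List<String> values) {
        List<String> copy = new ArrayList<String>(values);
        Collections.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        // Same values as in the script above: \u0131 is the dotless
        // lowercase i, \u0130 is the dotted capital I.
        List<String> values = Arrays.asList(
                "\u0131a", "Ia", "ia", "\u0130a",
                "\u0131b", "Ib", "ib", "\u0130b");

        System.out.println("java.version = " + System.getProperty("java.version"));
        for (String s : sorted(Locale.forLanguageTag("tr-TR"), values)) {
            System.out.println(s);
        }
    }
}
```

Comparing the output across JVM versions shows whether an index built under one of them would still be consistently ordered under the other.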

