[jira] Commented: (DERBY-2967) Single character does not match high value unicode character with collation TERRITORY_BASED

Knut Anders Hatlen (JIRA) Thu, 18 Oct 2007 11:01:18 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535987
 ]


Knut Anders Hatlen commented on DERBY-2967:
-------------------------------------------

One more question: the discussion above covers the special case where two 
strings of different lengths map to the same sequence of collation elements. 
What about two characters, c1 and c2, which have different Unicode code points 
but map to the same sequence of collation elements? Should both c1 = c2 and 
c1 LIKE c2 be true? That's how it's implemented, but I'm not sure whether 
c1 LIKE c2 should be true or false. I haven't checked what the standard says 
(and I'm not sure I want to... ;) ), but it feels a bit strange that two 
different characters should be LIKE because they have the same collation 
elements, when 'aa' is not LIKE 'å' because they are not the same single 
character. I understand how we can split the character sequence 'aa' into the 
single characters 'a' and 'a'. I don't understand how we can take a single 
collation element 'aa' and split it into two separate collation elements 'a' 
and 'a'. I'm sure the standard says it's correctly implemented, and I guess 
its wording will make it quite clear, perhaps even logical, that it has to be 
that way. I just wanted to double-check that we had verified it...
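
To make the question concrete, here is a minimal sketch of how one could 
inspect the collation elements the JDK assigns to two strings and check 
whether the sequences are identical. It is not Derby's LIKE code. The class 
and method names (CollationElementsSketch, elements, sameElements) are made up 
for illustration, and the cast assumes the JDK returns a RuleBasedCollator for 
the locale, which is what the standard JDK does in practice.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Derby.
class CollationElementsSketch {

    // Collect the raw collation elements the collator assigns to s.
    static List<Integer> elements(RuleBasedCollator collator, String s) {
        List<Integer> result = new ArrayList<>();
        CollationElementIterator it = collator.getCollationElementIterator(s);
        for (int e = it.next(); e != CollationElementIterator.NULLORDER; e = it.next()) {
            result.add(e);
        }
        return result;
    }

    // True if both strings produce exactly the same element sequence.
    static boolean sameElements(RuleBasedCollator collator, String s1, String s2) {
        return elements(collator, s1).equals(elements(collator, s2));
    }

    public static void main(String[] args) {
        // Assumption: the JDK hands back a RuleBasedCollator for this locale.
        RuleBasedCollator no =
            (RuleBasedCollator) Collator.getInstance(Locale.forLanguageTag("no-NO"));

        String aa = "aa";
        String aRing = "\u00E5"; // the single character 'å'

        System.out.println("elements('aa') = " + elements(no, aa));
        System.out.println("elements('å')  = " + elements(no, aRing));
        System.out.println("same elements  = " + sameElements(no, aa, aRing));
        System.out.println("compare == 0   = " + (no.compare(aa, aRing) == 0));
    }
}
```

Running this with different pairs of single characters would show which ones, 
if any, share a collation element sequence under a given territory, and those 
are the pairs where c1 = c2 and c1 LIKE c2 would both come out true in the 
current implementation.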

> Single character does not match high value unicode character with collation 
> TERRITORY_BASED
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2967
>                 URL: https://issues.apache.org/jira/browse/DERBY-2967
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.4.0.0
>            Reporter: Kathey Marsden
>            Assignee: Mamta A. Satoor
>         Attachments: DERBY2967_Oct11_07_diff.txt, 
> DERBY2967_Oct11_07_stat.txt, DERBY2967_offset_based_diff_Oct02_07.txt, 
> DERBY2967_offset_based_stat_Oct02_07.txt, fullcoll.out, 
> patch2_setOffset_fullcoll.out, patch2_with_setOffset_diff_Sep2007.txt, 
> patch2_with_setOffset_stat_Sep2007.txt, step1_iteratorbased_Sep1507_diff.txt, 
> step1_iteratorbased_Sep1507_stat.txt, temp_diff.txt, temp_stat.txt, 
> TestFrench.java, TestNorway.java
>
>
> With TERRITORY_BASED collation '_' does not match the character \uFA2D. It 
> is the same for English or Norwegian. For collation UCS_BASIC it matches 
> fine. Could you tell me if this is a bug?
> Here is a program to reproduce.
> import java.sql.*;
>
> public class HighCharacter {
>     public static void main(String[] args) throws Exception {
>         System.out.println("\n Territory no_NO");
>         Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
>         Connection conn = DriverManager.getConnection(
>             "jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>
>         System.out.println("\n Territory en_US");
>         conn = DriverManager.getConnection(
>             "jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>
>         System.out.println("\n Collation UCS_BASIC");
>         conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>     }
>
>     public static void testLikeWithHighestValidCharacter(Connection conn)
>             throws SQLException {
>         Statement stmt = conn.createStatement();
>         try {
>             stmt.executeUpdate("drop table t1");
>         } catch (SQLException se) {
>             // Drop failure is OK: the table may not exist yet.
>         }
>         stmt.executeUpdate("create table t1(c11 int)");
>         stmt.executeUpdate("insert into t1 values 1");
>
>         // \uFA2D - the highest valid character according to
>         // Character.isDefined() of JDK 1.4
>         PreparedStatement ps =
>             conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?");
>         String[] match = { "%", "_", "\uFA2D" };
>         for (int i = 0; i < match.length; i++) {
>             System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]);
>             ps.setString(1, match[i]);
>             ResultSet rs = ps.executeQuery();
>             if (rs.next() && rs.getString(1).equals("1"))
>                 System.out.println("PASS");
>             else
>                 System.out.println("FAIL: no match");
>             rs.close();
>         }
>         ps.close();
>         stmt.close();
>     }
> }
> Mamta made some comments on this issue in the following thread:
> http://www.nabble.com/Single-character-does-not-match-high-value-unicode-character-with-collation-TERRITORY_BASED.-Is-this-a-bug-tf4118767.html

