Hi Kathey,

I debugged the code below, and it looks like '_' not matching \uFA2D might be a bug. The comparison actually happens in existing code that was left over from the national character types. SQLChar and the newly introduced collation classes each have two methods:

    public BooleanDataValue like(DataValueDescriptor pattern)
    public BooleanDataValue like(DataValueDescriptor pattern, DataValueDescriptor escape) throws StandardException

In SQLChar, we check whether we are dealing with national character types and, if so, run special code for its LIKE implementation. The same special code is used by the collation-related classes such as CollatorSQLChar.

The special processing involves getting the collation elements for the character string from the RuleBasedCollator. They are obtained using RuleBasedCollator.getCollationElementIterator(characterString.getString()). Taking the specific example of Norwegian, '\uFA2D' converts into 2 collation elements (not 1, and this is the cause of the problem). These collation elements are passed as an int array to the following method in the iapi.types.Like class:

    public static Boolean like(int[] value, int valueLength, int[] pattern, int patternLength, RuleBasedCollator collator)

This method uses the passed RuleBasedCollator to find the collation elements for '_'. For our specific example, in Norwegian, '_' translates into only one collation element (vs. 2 elements for '\uFA2D'). When matching '_', we skip only 1 collation element in the array created for '\uFA2D', because '_' translated into 1 collation element. The following code is copied from Like.like:

    else if (matchSpecial(pat, pLoc, pEnd, anyCharInts))
    {
        // regardless of the char, it matches
        vLoc += anyCharInts.length;
        pLoc += anyCharInts.length;

        result = checkLengths(vLoc, vEnd, pLoc, pat, pEnd, anyStringInts);
        if (result != null)
            return result;
    }

So it seems the code above can't assume that every character in, say, Norwegian maps to 1 collation element just because '_' maps to 1 collation element.

I think we should go ahead and open a jira entry for this. I would like to hear if anyone has any comments on this.

thanks,
Mamta

On 7/20/07, Kathey Marsden <[EMAIL PROTECTED]> wrote:
With TERRITORY_BASED collation, '_' does not match the character \uFA2D. It is the same for English or Norwegian. For collation UCS_BASIC it matches fine. Could you tell me if this is a bug? Here is a program to reproduce.

Kathey

import java.sql.*;

public class HighCharacter {

    public static void main(String args[]) throws Exception
    {
        System.out.println("\n Territory no_NO");
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        Connection conn = DriverManager.getConnection("jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED");
        testLikeWithHighestValidCharacter(conn);
        conn.close();

        System.out.println("\n Territory en_US");
        conn = DriverManager.getConnection("jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED");
        testLikeWithHighestValidCharacter(conn);
        conn.close();

        System.out.println("\n Collation UCS_BASIC");
        conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true");
        testLikeWithHighestValidCharacter(conn);
    }

    public static void testLikeWithHighestValidCharacter(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement();
        try {
            stmt.executeUpdate("drop table t1");
        } catch (SQLException se) {
            // drop failure ok.
        }
        stmt.executeUpdate("create table t1(c11 int)");
        stmt.executeUpdate("insert into t1 values 1");

        // \uFA2D - the highest valid character according to
        // Character.isDefined() of JDK 1.4;
        PreparedStatement ps =
            conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?");
        String[] match = { "%", "_", "\uFA2D" };
        for (int i = 0; i < match.length; i++) {
            System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]);
            ps.setString(1, match[i]);
            ResultSet rs = ps.executeQuery();
            if (rs.next() && rs.getString(1).equals("1"))
                System.out.println("PASS");
            else
                System.out.println("FAIL: no match");
            rs.close();
        }
    }
}
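
A side note on the analysis at the top of this thread: the collation element counts for '_' and '\uFA2D' can be checked outside Derby with the JDK's own java.text classes. The following is only a minimal sketch. It assumes Collator.getInstance returns a RuleBasedCollator (usual for the stock JDK locales, but not guaranteed), and the class and method names (CollationElementCount, collatorFor, countElements) are made up for illustration and are not part of Derby. The counts depend on the collation rules shipped with the JDK in use, so no expected output is shown.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class CollationElementCount {

    // Hypothetical helper: fetch the locale's collator and insist that it is rule based.
    public static RuleBasedCollator collatorFor(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        if (!(collator instanceof RuleBasedCollator)) {
            throw new IllegalStateException("No RuleBasedCollator for " + locale);
        }
        return (RuleBasedCollator) collator;
    }

    // Hypothetical helper: count the collation elements the collator produces for a string.
    public static int countElements(RuleBasedCollator collator, String s) {
        CollationElementIterator it = collator.getCollationElementIterator(s);
        int count = 0;
        while (it.next() != CollationElementIterator.NULLORDER) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.forLanguageTag("no-NO"), Locale.US };
        for (Locale locale : locales) {
            RuleBasedCollator collator = collatorFor(locale);
            // If these two numbers differ, skipping anyCharInts.length elements
            // of the value for '_' does not consume the whole character.
            System.out.println(locale
                + ": '_' -> " + countElements(collator, "_")
                + " element(s), '\\uFA2D' -> " + countElements(collator, "\uFA2D")
                + " element(s)");
        }
    }
}
```

If the two counts printed for a locale differ, that is consistent with the explanation above: advancing the value position by the number of elements in '_' leaves part of '\uFA2D' unconsumed.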