[ https://issues.apache.org/jira/browse/DERBY-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518262 ]
Daniel John Debrunner commented on DERBY-2967:
----------------------------------------------

I would say neither (existing or proposed) is correct, though only after looking at your example.

The existing code is wrong because it is converting the '_' to a collation element and then skipping that number of collation elements?

The proposed results are wrong because in the French locale agraveCombined = "A\u0300" is a single character. I assume the \u0300 is a 'Combining character'.
http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf#G30602

Thus I think the results should be:

0 rows matching SELECT COUNT(*) FROM T WHERE VC LIKE A_
2 rows matching SELECT COUNT(*) FROM T WHERE VC LIKE _ (agrave, agraveCombined)
0 rows matching SELECT COUNT(*) FROM T WHERE VC LIKE __

As for implementing it, I think one has to use the getOffset/setOffset methods on CollationElementIterator, e.g. something along these lines to skip a character. The real solution would be more than this, but you get the idea.

    if (patternChar == '_')
        iterator.setOffset(iterator.getOffset() + 1);

> Single character does not match high value unicode character with collation TERRITORY_BASED
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2967
>                 URL: https://issues.apache.org/jira/browse/DERBY-2967
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.4.0.0
>            Reporter: Kathey Marsden
>            Assignee: Kathey Marsden
>         Attachments: TestFrench.java
>
>
> With TERRITORY_BASED collation '_' does not match the character \uFA2D. It is the same for English or Norwegian. For collation UCS_BASIC it matches fine. Could you tell me if this is a bug?
> Here is a program to reproduce.
> import java.sql.*;
>
> public class HighCharacter {
>     public static void main(String args[]) throws Exception {
>         System.out.println("\n Territory no_NO");
>         Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
>         Connection conn =
>             DriverManager.getConnection("jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>
>         System.out.println("\n Territory en_US");
>         conn =
>             DriverManager.getConnection("jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>
>         System.out.println("\n Collation UCS_BASIC");
>         conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true");
>         testLikeWithHighestValidCharacter(conn);
>     }
>
>     public static void testLikeWithHighestValidCharacter(Connection conn)
>             throws SQLException {
>         Statement stmt = conn.createStatement();
>         try {
>             stmt.executeUpdate("drop table t1");
>         } catch (SQLException se) {
>             // drop failure ok.
>         }
>         stmt.executeUpdate("create table t1(c11 int)");
>         stmt.executeUpdate("insert into t1 values 1");
>
>         // \uFA2D - the highest valid character according to
>         // Character.isDefined() of JDK 1.4;
>         PreparedStatement ps =
>             conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?");
>         String[] match = { "%", "_", "\uFA2D" };
>         for (int i = 0; i < match.length; i++) {
>             System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]);
>             ps.setString(1, match[i]);
>             ResultSet rs = ps.executeQuery();
>             if (rs.next() && rs.getString(1).equals("1"))
>                 System.out.println("PASS");
>             else
>                 System.out.println("FAIL: no match");
>             rs.close();
>         }
>     }
> }
>
> Mamta made some comments on this issue in the following thread:
> http://www.nabble.com/Single-character-does-not-match-high-value-unicode-character-with-collation-TERRITORY_BASED.-Is-this-a-bug-tf4118767.html

--
This message is automatically generated by JIRA.
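To make the row counts expected in the comment concrete, here is a minimal sketch in which '_' consumes one user-perceived character (a base character plus any combining marks) rather than one Java char. It is not Derby's code and not the CollationElementIterator approach suggested in the comment: it swaps in java.text.BreakIterator to find character boundaries and compares literal pattern characters by plain string equality instead of by collation elements. The class name LikeUnderscoreSketch and the helpers clusters and likeMatches are hypothetical, and '%' and escape characters are not handled.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

public class LikeUnderscoreSketch {

    /** Splits s into user-perceived characters (base character plus combining marks). */
    static List<String> clusters(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(s.substring(start, end));
        }
        return out;
    }

    /**
     * Hypothetical helper: matches value against a pattern made only of
     * literal characters and '_'. Assumption: literals are compared with
     * String.equals, which stands in for a collation-aware comparison.
     */
    static boolean likeMatches(String value, String pattern) {
        List<String> v = clusters(value);
        List<String> p = clusters(pattern);
        if (v.size() != p.size()) {
            return false;
        }
        for (int i = 0; i < p.size(); i++) {
            boolean wildcard = p.get(i).equals("_");
            if (!wildcard && !p.get(i).equals(v.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String agrave = "\u00C0";          // precomposed A with grave
        String agraveCombined = "A\u0300"; // A followed by combining grave (U+0300)
        for (String pattern : new String[] { "A_", "_", "__" }) {
            int rows = 0;
            if (likeMatches(agrave, pattern)) rows++;
            if (likeMatches(agraveCombined, pattern)) rows++;
            System.out.println(rows + " rows matching LIKE " + pattern);
        }
    }
}
```

Under these assumptions "A\u0300" is treated as one character, so only the single '_' pattern matches both values. A collation-aware version would still need to compare literal pattern characters through the collator, which this sketch leaves out.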