[ https://issues.apache.org/jira/browse/DERBY-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470603 ]
Mamta A. Satoor commented on DERBY-1478:
----------------------------------------

Rick, I looked at the SQL specification (Part 2) regarding SQL identifiers.
For background, some general information on SQL identifiers from the SQL spec
is as follows.

<Start of contents from SQL spec>
1) As per SQL specification Part 2, Section 4.2.4, the character repertoire
for SQL identifiers, SQL_IDENTIFIER, consists of <SQL language character>
Latin characters and digits, and all the other characters that the
SQL-implementation supports for use in <regular identifier>.

After this, everything else related to the SQL_IDENTIFIER character repertoire
is defined as implementation-defined. To be specific,
2) Section 4.2.5, Character encoding form, Pg 22, says SQL_IDENTIFIER is an
implementation-defined character encoding form. It is applicable to the
SQL_IDENTIFIER character repertoire.
3) Section 4.2.6, Collation, Pg 23, says SQL_IDENTIFIER is an
implementation-defined collation. It is applicable to the SQL_IDENTIFIER
character repertoire.
4) And lastly, in Section 4.2.7, Character Sets, SQL_IDENTIFIER is a character
set whose repertoire is SQL_IDENTIFIER and whose character encoding form is
SQL_IDENTIFIER. The name of its default collation is SQL_IDENTIFIER.

5) Section 4.2.3.1, Pg 19, talks about case folding. <fold> is a pair of
functions for converting all the lower case and title case characters in a
given string to upper case, or all the upper case and title case characters to
lower case. A lower case character is a character in the Unicode General
Category class "Ll" and an upper case character is a character in the Unicode
General Category class "Lu".
<End of contents from SQL spec>

From the information above, we see that the SQL specification leaves the
character encoding form (CEF) and collation for SQL identifiers as
implementation-defined, but I do not see it saying specifically that case
folding is implementation-defined.
Even section 4.2.3.1, Pg 19, second paragraph, talks about converting case in
a generic manner in the context of Unicode and not the English locale.

So, I am not sure why Derby/Cloudscape chose to use the English locale to do
case conversion of SQL identifiers. Derby's StringUtil class, where the SQL
case conversion code lies, has the following comment:

	// The functions below are used for uppercasing SQL in a consistent manner.
	// Cloudscape will uppercase Turkish to the English locale to avoid i
	// uppercasing to an uppercase dotted i. In future versions, all
	// casing will be done in English. The result will be that we will get
	// only the 1:1 mappings in
	// http://www.unicode.org/Public/3.0-Update1/UnicodeData-3.0.1.txt
	// and avoid the 1:n mappings in
	//http://www.unicode.org/Public/3.0-Update1/SpecialCasing-3.txt
	//
	// Any SQL casing should use these functions

Dan, you mentioned in one of your comments to this Jira entry that "Currently
the uppercasing of SQL statements and identifiers is fixed as English to avoid
unexpected issue with other languages". Can you please explain what you mean
by unexpected issues? Is that the same reason for recommending the same
behavior for system tables?

> Add built in language based ordering and like processing to Derby
> -----------------------------------------------------------------
>
>                 Key: DERBY-1478
>                 URL: https://issues.apache.org/jira/browse/DERBY-1478
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.1.2.1
>            Reporter: Kathey Marsden
>         Assigned To: Mamta A. Satoor
>         Attachments: DERBY-1478_FunctionalSpecV1.html
>
>
> It would be good for Derby to have built in Language based ordering based on
> locale specific Collator.
> Language based ordering is an important feature for international deployment.
> DERBY-533 offers one implementation option for this but according to the
> discussion in that issue National Character Types carry a fair amount of
> baggage with them especially in the form of concerns about conversion to
> and from datetime and number types. Rick mentioned SQL language for
> collations as an option for language based ordering. There may be other
> options too, but I thought it worthwhile to add an issue for the high level
> functional concern, so the best choice can be made for implementation without
> assuming that National Character Types is the only solution.
> For possible 10.1 workaround and examples see:
> http://wiki.apache.org/db-derby/LanguageBasedOrdering

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.