I have a couple questions about collations in Derby. As I understand it, Derby supports two collations: UCS_BASIC and TERRITORY_BASED.

By default, all string types have UCS_BASIC collation and the string columns in system catalogs always have UCS_BASIC collation regardless of whether user tables use a different collation.

If the database is created with the collation=TERRITORY_BASED attribute, then all string types created by users have TERRITORY_BASED collation. This includes the types of string columns in user tables, string args in user routines, and string constants used in queries.

I believe that the following two statements are true. I will explain why shortly. Please let me know if you think that I am missing something.

1) Although you can cast a UCS_BASIC string to a TERRITORY_BASED string, you can't perform the reverse cast.

2) In a database which was created with TERRITORY_BASED collation, that is the collation of the string return types and args of all system procedures and functions.

I believe (1) because I can't find any examples of how to perform the reverse cast. I don't see any examples in CollationTest and CollationTest2. The <COLLATION> and <COLLATE> tokens are not actually used anywhere in Derby's SQL grammar.

I believe (2) because that is the behavior of the sample procedures and functions which I ran: syscs_util.syscs_check_table, syscs_util.syscs_set_database_property, and syscs_util.syscs_get_database_property. I haven't systematically tested all of Derby's system routines, but it seems likely to me that they all behave the same way.

I think that this current behavior is reasonable. In particular, I don't think that we should use UCS_BASIC collation for the string return types and args of system routines. That would make the routines very awkward to use.

Am I tracking? Does this summary sound correct?

Thanks,
-Rick

Reply via email to