I have a couple questions about collations in Derby. As I understand it,
Derby supports two collations: UCS_BASIC and TERRITORY_BASED.
By default, all string types have UCS_BASIC collation and the string
columns in system catalogs always have UCS_BASIC collation regardless of
whether user tables use a different collation.
If the database is created with the collation=TERRITORY_BASED attribute,
then all string types created by users have TERRITORY_BASED collation.
This includes the types of string columns in user tables, string args in
user routines, and string constants used in queries.
I believe that the following two statements are true. I will explain why
shortly. Please let me know if you think that I am missing something.
1) Although you can cast a UCS_BASIC string to a TERRITORY_BASED string,
you can't perform the reverse cast.
2) In a database which was created with TERRITORY_BASED collation, that
is the collation of the string return types and args of all system
procedures and functions.
I believe (1) because I can't find any examples of how to perform the
reverse cast. I don't see any examples in CollationTest and
CollationTest2. The <COLLATION> and <COLLATE> tokens are not actually
used anywhere in Derby's SQL grammar.
I believe (2) because that is the behavior of the sample procedures and
functions which I ran: syscs_util.syscs_check_table,
syscs_util.syscs_set_database_property, and
syscs_util.syscs_get_database_property. I haven't systematically tested
all of Derby's system routines, but it seems likely to me that they all
behave the same way.
I think that this current behavior is reasonable. In particular, I don't
think that we should use UCS_BASIC collation for the string return types
and args of system routines. That would make the routines very awkward
to use.
Am I tracking? Does this summary sound correct?
Thanks,
-Rick