Laura Stewart wrote:
As part of adding the new attribute collation=TERRITORY_BASED, I think
that we need to describe how Derby handles collation.
I am trying to get my head around the best way to describe collation
in Derby... for 10.3
In general terms, a collating sequence is a defined ordering for
character data that determines whether a particular character sorts
higher than, lower than, or the same as another character. Each
character set also has a default collation.
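
To make the idea concrete, here is a small sketch in plain Java (not
Derby code) that contrasts ordering by raw Unicode code unit values
with ordering under a locale-sensitive java.text.Collator. The class
name CollationSketch and its two helper methods are made up for
illustration, and the assumption that a territory-based collation
behaves like a Collator for the database's locale is just that, an
assumption, not a statement about how Derby implements it.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration class; not part of Derby.
public class CollationSketch {

    // Sort by the natural String order, i.e. by UTF-16 code unit values.
    static String[] sortByCodeUnit(String[] words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sort using the collation rules the JDK provides for the given locale.
    static String[] sortByLocale(String[] words, Locale locale) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "Apple", "apple", "Banana"};

        // Code unit order puts every uppercase ASCII letter before every
        // lowercase one: [Apple, Banana, apple, banana]
        System.out.println(Arrays.toString(sortByCodeUnit(words)));

        // Locale order groups both spellings of "apple" ahead of both
        // spellings of "banana"; the order within each pair depends on
        // the collator's rules.
        System.out.println(Arrays.toString(sortByLocale(words, Locale.US)));
    }
}
```

The point of the comparison is that the same four strings come out in
a different order depending on which collating sequence is applied.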
In Derby, it is my understanding that all of our string data types are
represented as Unicode sequences. Is that correct?
Yes, and I think that fact should be documented, at least in the ref
guide section for each string data type: CHAR, VARCHAR, LONG VARCHAR and
CLOB. Specifically, I think it is Unicode 2.0, which is the version
supported by Java 1.1. I don't think any changes have been made to
support later versions of Unicode, and I don't know what changes, if
any, would be required.
Dan.