Hi Dan,
You asked how collation will be set for character expressions such as a
string literal, a cast of a character expression to a character type, trim,
concatenation, etc.
A DTD (DataTypeDescriptor) will have an attribute called collation type. In
10.3, the possible values for it will be -1 meaning UNKNOWN collation, 0
meaning UCS_BASIC and 1 meaning TERRITORY_BASED. By default, DTDs will have
the collation type set to UNKNOWN. If the DTD is for a user table's CHAR
column, then the DTD's collation will be set to TERRITORY_BASED or UCS_BASIC,
depending on what was requested at database create time in the JDBC URL.
This setting of collation will be done by DTD.setCollationType(int). If the
DTD is for a SYS schema table's CHAR column, then the DTD's collation will
be set to UCS_BASIC.
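To make the column rule above concrete, here is a rough sketch in Java. It is
only an illustration, not the actual Derby code: the class
ColumnCollationSketch, the nested Dtd class, getCollationType() and
forCharColumn() are names I made up for this example. Only the three values
and setCollationType(int) come from the description above.

```java
// Hypothetical sketch; this is NOT Derby's real DataTypeDescriptor.
final class ColumnCollationSketch {
    // Values as described for 10.3.
    static final int COLLATION_UNKNOWN = -1;
    static final int COLLATION_UCS_BASIC = 0;
    static final int COLLATION_TERRITORY_BASED = 1;

    // Stand-in for a DTD that carries only the collation type.
    static final class Dtd {
        // Every DTD starts out with UNKNOWN collation.
        private int collationType = COLLATION_UNKNOWN;

        void setCollationType(int collationType) {
            this.collationType = collationType;
        }

        // Assumed getter, added for the example.
        int getCollationType() {
            return collationType;
        }
    }

    // Builds the DTD for a CHAR column (hypothetical helper).
    // SYS schema columns always get UCS_BASIC; user columns get whatever
    // collation was requested when the database was created.
    static Dtd forCharColumn(boolean inSysSchema, int databaseCollation) {
        Dtd dtd = new Dtd();
        dtd.setCollationType(inSysSchema ? COLLATION_UCS_BASIC : databaseCollation);
        return dtd;
    }
}
```

In this sketch a user table column in a territory-based database ends up with
TERRITORY_BASED, while a SYS schema column in the same database still ends up
with UCS_BASIC.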
I think there is a DTD associated with every character expression, such as
a string literal, a cast of a character expression to a character type,
trim, concatenation, etc. Since the default collation type is UNKNOWN, these
character expressions will have their collation type as UNKNOWN until they
actually get used in a collation method. When they get used in a collation
method, their collation type will be determined by the context in which they
appear, i.e. if the other operand of the collation method has UCS_BASIC
associated with it, then the character expression's collation type in its
DTD will get set to UCS_BASIC, and similarly for TERRITORY_BASED if the
other operand has that collation type associated with it.
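A rough sketch of that context rule, again in Java and again only as an
illustration: OperandCollationSketch, its nested Dtd class, getCollationType()
and resolve() are hypothetical names, not existing Derby code. The sketch
deliberately does nothing when both operands are UNKNOWN or when both already
have a known collation, since those cases are not covered above.

```java
// Hypothetical sketch; this is NOT Derby's real DataTypeDescriptor.
final class OperandCollationSketch {
    static final int COLLATION_UNKNOWN = -1;
    static final int COLLATION_UCS_BASIC = 0;
    static final int COLLATION_TERRITORY_BASED = 1;

    // Stand-in for the DTD of one operand.
    static final class Dtd {
        private int collationType = COLLATION_UNKNOWN;

        void setCollationType(int collationType) {
            this.collationType = collationType;
        }

        // Assumed getter, added for the example.
        int getCollationType() {
            return collationType;
        }
    }

    // Called when two operands meet in a collation-sensitive operation.
    // An operand whose collation is still UNKNOWN takes the collation of
    // the other operand. Any other combination is left untouched here.
    static void resolve(Dtd left, Dtd right) {
        int l = left.getCollationType();
        int r = right.getCollationType();
        if (l == COLLATION_UNKNOWN && r != COLLATION_UNKNOWN) {
            left.setCollationType(r);
        } else if (r == COLLATION_UNKNOWN && l != COLLATION_UNKNOWN) {
            right.setCollationType(l);
        }
    }
}
```

For example, when a string literal (UNKNOWN) is compared with a user column
that is TERRITORY_BASED, resolve() would leave both DTDs TERRITORY_BASED.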
I hope this answers your question. I will include this information on the
wiki page for DERBY-1478 so that everything is tracked in one central
location.
thanks,
Mamta
On 3/18/07, Daniel John Debrunner <[EMAIL PROTECTED]> wrote:
Mike Matrigali wrote:
> I'll let someone else summarize. At this point I have
> been convinced by Dan that his proposal is the best way
> forward. And by rick and dan that we should just go
> ahead and store column level metadata for the collate
> info in the store, as well as in the language level
> per column metadata.
>
> The key points that convinced me are:
> o Even though we are proposing a "single" collation per
> database, internally we need to support 2 per database to
> do the right thing for system catalogs. Once there are
> 2 we needed support in store to at the very least store
> metadata per conglomerate.
>
> o It looks like dan's proposal makes the runtime creation
> of the collated and non-collated objects easier. I don't
> understand all the places this affects, but anything that
> makes this easier seems good to me.
I think some design specification or notes would be really useful for
collation. As Mike says, the places where this has an impact are not well
known. Starting a list on a wiki page would be good; then others could
look and ask if other areas are affected. E.g. I think the path we are
heading down is that at create table or alter table add column time the
collation for that column will be set in its DataTypeDescriptor, just
like its nullability is today. Then at bind time when that column is
referenced the collation type will be available through its DTD. But
there are a host of other character expressions, it would be good to
list these up front and how the collation will be set, rather than
discovering them one at a time through coding (and missing some). E.g.
What's the defined behaviour for:
string literal
cast to character type of a character expression
trim
concatenation
etc.
Then some writeup of how store column collation information is to be
stored (along with upgrade issues) would really help cement a good
design up front.
Thanks,
Dan.