Jim Starkey wrote:
Jay Pipes wrote:
Regarding collations, Brian and I just chatted about this. Currently,
MySQL only supports a single byte to indicate the collation, which means
that only 256 collations are supported by MySQL. This is a problem,
since they've already run out of identifiers.
Brian thinks we can chop the number of supported collations down
significantly in Drizzle, because many of them are charset-specific, and
can restart the numbering from 0, meaning that the ABI would not need to
change. This is important, as Heikki has said that he's not willing to
update InnoDB to support a 2-byte collation identifier at this point.
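
To make the one-byte limit and the renumbering idea concrete, here is a rough Python sketch. The helper names and the list of legacy ids are made up for illustration; this is not MySQL's or Drizzle's actual numbering or storage format, just the arithmetic behind it.

```python
import struct


def pack_collation_id(collation_id: int) -> bytes:
    """Pack a collation id into one unsigned byte (hypothetical format)."""
    # struct raises struct.error for anything outside 0..255.
    return struct.pack("B", collation_id)


def renumber(kept_ids):
    """Map the surviving collation ids onto a dense range starting at 0."""
    return {old: new for new, old in enumerate(sorted(kept_ids))}


# Hypothetical set of ids that survive the pruning; not real MySQL data.
legacy_ids = [8, 33, 63, 83, 192, 255]
mapping = renumber(legacy_ids)

# Every renumbered id still fits in the same single byte.
packed = [pack_collation_id(new_id) for new_id in mapping.values()]

try:
    pack_collation_id(256)
    overflowed = False
except struct.error:
    # A 257th collation simply cannot be represented in one byte.
    overflowed = True
```

As long as the pruned set has at most 256 members, the identifier width stays the same, which is why the ABI would not have to change.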
Jim, does this answer your question or were you looking for a different
answer?
Not really. I've never quite understood how collations are supposed to
work in MySQL. Is a collation a property of a session? Of a
statement? Of data? And what does the number 127 have to do with the
zillion or so collations defined in the world?
Probably better than having only the collation "binary" :)
In HADB, we implemented collations as an attribute of the charset. This
may not be the perfect solution (a collation should probably be
independent of charsets?), but from a practical viewpoint you might, e.g.,
use a homegrown collation implementation for the ASCII charset and the
ICU implementation for UTF-8. When depending on different external
implementations it was difficult to guarantee similar behaviour of one
collation when dealing with different character sets.
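
A minimal sketch of the "collation as an attribute of the charset" design, in Python. The `Charset` class and the collation names are invented for illustration and are not HADB's API; `str.casefold` only stands in for an external library such as ICU.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A collation is modelled here as a function producing a sort key.
SortKey = Callable[[str], object]


@dataclass
class Charset:
    """Hypothetical charset that owns its own collation implementations."""
    name: str
    collations: Dict[str, SortKey] = field(default_factory=dict)

    def sort(self, values, collation):
        # Look the collation up on the charset, not in a global table.
        return sorted(values, key=self.collations[collation])


ascii_cs = Charset("ascii", {
    "binary": lambda s: s.encode("ascii"),
    "nocase": lambda s: s.lower(),        # homegrown implementation
})

utf8_cs = Charset("utf8", {
    "binary": lambda s: s.encode("utf-8"),
    "nocase": lambda s: s.casefold(),     # stand-in for an external library
})

words = ["b", "A", "a", "B"]
ascii_nocase = ascii_cs.sort(words, "nocase")
utf8_binary = utf8_cs.sort(words, "binary")
```

Note that the two "nocase" entries share a name but not an implementation, which is exactly where the difficulty of guaranteeing similar behaviour across charsets comes from.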
What is the interaction between an index's collation and the session?
Can range retrievals use an index if the session collation does not match
the index collation?
That is a good question... The reason for having an index in a specific
collation is probably that it matches the collation of the locale of the
users. Thus, if the user's locale's collation is the same as the
collation of the index, everything is fine. The tradeoff occurs when a
user with a different locale/collation performs a query. Generally you
cannot guarantee that the range covered by the index is the same as the
range the user expects, hence the index cannot be used. I guess this is
a tradeoff that should be decided by the application programmer.
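
The range mismatch can be shown with a toy example in Python. A sorted list stands in for the index, and `range_scan` is a made-up helper, not any real storage engine call. The index is built with a binary (code point) collation while the session expects case-insensitive ordering.

```python
import bisect


def range_scan(index_keys, low, high):
    """Return the contiguous slice of a sorted index between low and high."""
    lo = bisect.bisect_left(index_keys, low)
    hi = bisect.bisect_right(index_keys, high)
    return index_keys[lo:hi]


rows = ["apple", "Banana", "cherry", "Date"]

# Index built with a binary collation: uppercase letters sort first.
binary_index = sorted(rows)

# What a case-insensitive session expects for BETWEEN 'a' AND 'c'.
expected = [r for r in rows if "a" <= r.lower() <= "c"]

# Scanning the binary index with the same bounds silently misses "Banana".
from_binary_index = range_scan(binary_index, "a", "c")

# An index whose keys use the session's collation returns the expected range.
nocase_index = sorted(r.lower() for r in rows)
from_nocase_index = range_scan(nocase_index, "a", "c")
```

Here `expected` contains both "apple" and "Banana", while the scan over the binary index returns only "apple". The rows the session wants are not contiguous in the index order, so a range retrieval over the mismatched index gives wrong answers rather than merely slow ones.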
The question behind the question is how Nimbus should handle
collations. I've got a cleaner piece of paper than you do. Have any
advice?
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss