Jim Starkey wrote:
Jay Pipes wrote:
Regarding collations, Brian and I just chatted about this.  Currently,
MySQL only supports a single byte to indicate the collation, which means
that only 256 collations are supported by MySQL.  This is a problem,
since they've already run out of identifiers.

Brian thinks we can chop the number of supported collations down
significantly in Drizzle, because many of them are charset-specific, and
can restart the numbering from 0, meaning that the ABI would not need to
change.  This is important as Heikki has indicated that he's
not willing to update InnoDB to support a 2-byte collation identifier at
this point.
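
To make the one-byte limit concrete, here is a minimal sketch in Python. It is not MySQL or Drizzle code: `pack_collation_id` and `renumber` are hypothetical helpers, and the collation names passed in are only placeholders. It shows that an id of 256 does not fit in one byte, and that a trimmed list of collations can be renumbered from 0 so the field width stays the same.

```python
import struct


def pack_collation_id(collation_id: int, width: int = 1) -> bytes:
    """Pack a collation id into a fixed-width field (hypothetical layout)."""
    fmt = {1: "B", 2: "<H"}[width]  # unsigned byte, or little-endian 16-bit
    return struct.pack(fmt, collation_id)


def renumber(kept_collations):
    """Assign compact ids starting at 0 to the collations that are kept."""
    return {name: new_id for new_id, name in enumerate(kept_collations)}


def fits_in_one_byte(collation_id: int) -> bool:
    """Return True if the id can be stored in the one-byte field."""
    try:
        pack_collation_id(collation_id, width=1)
        return True
    except struct.error:
        return False
```

With a one-byte field, ids 0 through 255 pack and 256 is rejected; a two-byte field would accept it, but that is the ABI change the storage engine would have to follow.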

Jim, does this answer your question or were you looking for a different
answer?

Not really. I've never quite understood how collations are supposed to work in MySQL. Is a collation a property of a session? Of a statement? Of data? And what does the number 127 have to do with the zillion or so collations defined in the world?

Probably better than having only the collation "binary" :)

In HADB, we implemented collations as an attribute of the charset. This may not be the perfect solution (a collation should probably be independent of charsets?), but from a practical viewpoint you might, e.g., use a homegrown collation implementation for the ASCII charset and the ICU implementation for UTF-8. When depending on different external implementations, it was difficult to guarantee similar behaviour of one collation across different character sets.
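
A rough sketch of the "collation as an attribute of the charset" idea, in Python. This is not HADB code: the `Charset` class and both key functions are made up for illustration, and the UTF-8 function is a plain byte-order stand-in where a real system might call ICU.

```python
from typing import Callable, Dict, List

SortKey = Callable[[str], bytes]


def ascii_ci_key(text: str) -> bytes:
    """Homegrown case-insensitive key; only valid for ASCII text."""
    return text.encode("ascii").lower()


def utf8_binary_key(text: str) -> bytes:
    """Stand-in for an external implementation: plain UTF-8 byte order."""
    return text.encode("utf-8")


class Charset:
    """A charset that owns its collations (hypothetical design)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.collations: Dict[str, SortKey] = {}

    def add_collation(self, name: str, key: SortKey) -> None:
        self.collations[name] = key

    def sort(self, values: List[str], collation: str) -> List[str]:
        # Lookup is scoped to this charset, so the same collation name
        # may map to a different implementation in another charset.
        return sorted(values, key=self.collations[collation])


ascii_cs = Charset("ascii")
ascii_cs.add_collation("default", ascii_ci_key)

utf8_cs = Charset("utf8")
utf8_cs.add_collation("default", utf8_binary_key)
```

Sorting the same strings with the "default" collation of each charset gives different orders here, which is the consistency problem described above when the implementations come from different places.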


What is the interaction between an index's collation and the session? Can range retrievals use an index if the session collation does match the index collation?

That is a good question... The reason for having an index in a specific collation is probably that it matches the collation of the locale of the users. Thus, if the user's locale's collation is the same as the collation of the index, everything is fine. The tradeoff occurs when a user with a different locale/collation performs a query. Generally you cannot guarantee that the range covered by the index is the same as the range the user expects, hence the index cannot be used. I guess this is a tradeoff that should be decided by the application programmer.
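
To illustrate why a mismatched index cannot simply be range-scanned, here is a small Python sketch. It models an index as a list sorted under one collation; `binary_key` and `ci_key` are simplified stand-ins for real collations, and nothing here reflects how MySQL, Drizzle, or any storage engine actually stores indexes.

```python
import bisect


def binary_key(s: str) -> bytes:
    """Byte-order comparison (stand-in for a binary collation)."""
    return s.encode("utf-8")


def ci_key(s: str) -> str:
    """Case-insensitive comparison (stand-in for a *_ci collation)."""
    return s.casefold()


def build_index(values, key):
    """An 'index' here is just the values sorted under one collation."""
    return sorted(values, key=key)


def range_scan(index, key, low, high):
    """Range lookup that trusts the index to be ordered by `key`."""
    keys = [key(v) for v in index]
    lo = bisect.bisect_left(keys, key(low))
    hi = bisect.bisect_right(keys, key(high))
    return index[lo:hi]


def range_filter(values, key, low, high):
    """What the session expects: compare every row under its own collation."""
    return sorted(v for v in values if key(low) <= key(v) <= key(high))


words = ["apple", "Banana", "cherry", "Apple", "banana", "Cherry"]

binary_index = build_index(words, binary_key)
ci_index = build_index(words, ci_key)

# Session uses the case-insensitive collation: BETWEEN 'apple' AND 'banana'
expected = range_filter(words, ci_key, "apple", "banana")
from_binary_index = range_scan(binary_index, binary_key, "apple", "banana")
from_ci_index = range_scan(ci_index, ci_key, "apple", "banana")
```

Scanning the binary-ordered index returns only the lowercase rows, while the case-insensitive session expects the capitalised ones as well. The index built with the session's collation returns the full set, which matches the point above: the index is usable for ranges only when the collations agree.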

The question behind the question is how should Nimbus handle collations. I've got a cleaner piece of paper than you do. Have any advice?


