Character set introducers tell MySQL to interpret a string literal in
a specific character set.  They are used as follows:

SELECT _latin1'Kaj Arnö';

where _latin1 is the charset introducer.
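
To make concrete what the introducer changes, here is a rough Python
sketch (not Drizzle or MySQL code). It assumes the client sends the
literal as latin1 bytes; the variable names are made up for
illustration. The same bytes are only meaningful once you say which
character set they are in, which is the job the introducer does.

```python
# Bytes a latin1 client would send for the literal 'Kaj Arnö'
# (assumption: the connection really is latin1).
raw = "Kaj Arn\u00f6".encode("latin-1")

# Reading the bytes as latin1, which is what _latin1 asks for,
# recovers the intended text.
as_latin1 = raw.decode("latin-1")

# Reading the very same bytes as UTF-8 fails: 0xF6 is not a valid
# UTF-8 lead byte, so the literal cannot be taken at face value.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(raw, as_latin1, valid_utf8)
```

If every literal is guaranteed to arrive as UTF-8 already, this
reinterpretation step has nothing left to do.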

Question 1:

Can introducers go away now that all charsets are 4-byte UTF8?

Question 2:

Related to Brian's question about CONVERT and CAST, can we get rid of
charset conversions entirely?  For instance, here is a statement from
the now-disabled ctype_utf8 test case:

select CONVERT(_latin1'Günter André' using utf8) LIKE
CONVERT(_latin1'GÜNTER%' USING utf8);

I'm actually not sure what this is testing at all.  I mean, "USING utf8"
designates the character set, not the collation, but the LIKE expression
depends on the collation, not the character set. :(
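
Here is a small Python sketch of the distinction, under loud
assumptions: the helper names are hypothetical, str.casefold() is only
a stand-in for a case-insensitive collation, and a plain byte prefix
check is a stand-in for a binary collation. Neither is Drizzle's real
collation code. The transcoding step is identical in both cases; only
the comparison rule decides whether the pattern matches.

```python
def transcode_latin1_to_utf8(raw: bytes) -> bytes:
    """Roughly what CONVERT(_latin1'...' USING utf8) does to the bytes."""
    return raw.decode("latin-1").encode("utf-8")


def like_prefix_binary(value: bytes, prefix: bytes) -> bool:
    """Stand-in for LIKE 'prefix%' under a binary collation."""
    return value.startswith(prefix)


def like_prefix_case_insensitive(value: bytes, prefix: bytes) -> bool:
    """Stand-in for LIKE 'prefix%' under a case-insensitive collation."""
    folded_value = value.decode("utf-8").casefold()
    folded_prefix = prefix.decode("utf-8").casefold()
    return folded_value.startswith(folded_prefix)


# Both operands go through the same character set conversion.
value = transcode_latin1_to_utf8("G\u00fcnter Andr\u00e9".encode("latin-1"))
prefix = transcode_latin1_to_utf8("G\u00dcNTER".encode("latin-1"))

binary_match = like_prefix_binary(value, prefix)
ci_match = like_prefix_case_insensitive(value, prefix)
print(binary_match, ci_match)
```

So the result of the test statement is decided by whichever collation
the converted strings end up with, not by the "USING utf8" part.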

In short, there is a ton of work to be done to actually verify proper
UTF8 transcoding and collation.

Question 3:

Who in our community is an expert on multi-byte character sets,
specifically UTF8, and will be able to write unit tests which verify
input and output of Drizzle's UTF8 collations?
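
As a starting point for discussion only, this is the kind of check I
have in mind, sketched with Python's unittest instead of our own test
harness. The sample characters and the class name are my own picks,
and it only covers encoding lengths and round trips for 1- to 4-byte
UTF-8 sequences. It says nothing about collation order, which is the
part that needs an expert.

```python
import unittest

# One sample character per UTF-8 encoded length (1 to 4 bytes).
SAMPLES = {
    "A": 1,            # U+0041, ASCII
    "\u00f6": 2,       # U+00F6, o with diaeresis
    "\u20ac": 3,       # U+20AC, euro sign
    "\U0001F600": 4,   # U+1F600, outside the Basic Multilingual Plane
}


class Utf8RoundTripTest(unittest.TestCase):
    def test_encoded_length(self):
        # Each sample must occupy the expected number of bytes.
        for char, expected_len in SAMPLES.items():
            self.assertEqual(len(char.encode("utf-8")), expected_len)

    def test_round_trip(self):
        # Encoding and then decoding must return the original character.
        for char in SAMPLES:
            self.assertEqual(char.encode("utf-8").decode("utf-8"), char)
```

Saved to a file, this runs with "python -m unittest <file>".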

Thanks,

jay

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
