Re: [Drizzle-discuss] UTF8, charset introducers

Brian Aker Thu, 11 Sep 2008 10:15:51 -0700

Hi!

On Sep 11, 2008, at 10:00 AM, Jay Pipes wrote:

Character set introducers are used when telling MySQL to consider a
string literal using a specific character set. They are used as follows:


SELECT _latin1'Kaj Arnö';

where _latin1 is the charset introducer.
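
To illustrate why the label matters, here is a minimal Python sketch (not MySQL or Drizzle code). It assumes only that an introducer tells the server which encoding a literal's bytes are in; the variable names are made up for the example. The same name is encoded as Latin-1 and as UTF-8, and the Latin-1 bytes are then misread as UTF-8.

```python
# Illustration only: shows how the same text differs at the byte level
# depending on the character set, which is what an introducer declares.
name = "Kaj Arnö"

latin1_bytes = name.encode("latin-1")  # 'ö' becomes the single byte 0xF6
utf8_bytes = name.encode("utf-8")      # 'ö' becomes the two bytes 0xC3 0xB6

# With the correct label, the Latin-1 bytes round-trip to the original text.
round_trip = latin1_bytes.decode("latin-1")

# With the wrong label, the bytes are not even valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_valid_as_utf8 = True
except UnicodeDecodeError:
    latin1_valid_as_utf8 = False

print(len(latin1_bytes), len(utf8_bytes), latin1_valid_as_utf8)
```

If every literal is known to be UTF-8, there is no second interpretation to choose from, which is the case the question below asks about.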

Question 1:

Can introducers go away now that all charsets are 4-byte UTF8?

Yes, and they need to. We only do UTF-8. All comparisons/strings are
done in it. Collations for indexes are the only things you have a
choice on.

This is not 100% true in the code right now, but it gets there with
every push. There are hardcoded latin comparisons that I am having to
remove.

Question 2:

Related to Brian's question about CONVERT and CAST, can we get rid of
charset conversions entirely?  For instance, here is a statement from

Bigger question in my mind, can we get rid of the two functions
completely? Both are implemented in the parser and not as actual
functions. To modularize the parser these types of functions have to
go away (there are very few of them left).

Who in our community is an expert on multi-byte character sets,
specifically UTF8, and will be able to write unit tests which verify
input and output of Drizzle's UTF8 collations?



My hope was to get Toru to write us some tests in Japanese :)

We need tests though that are specific to all of the collations we
have right now. Without a test, I do not trust the collation.
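
As a rough idea of the shape such per-collation tests could take, here is a small Python sketch. It does not talk to Drizzle: `binary_key` and `folded_key` are hypothetical stand-ins for a binary collation and a case/accent-insensitive one, and the expected orderings are those of the stand-ins, not of any real Drizzle collation. A real test would send the same inputs through the server with ORDER BY and compare against a reviewed expected order.

```python
import unicodedata


def binary_key(s: str) -> bytes:
    # Stand-in for a binary collation: order by the raw UTF-8 bytes.
    return s.encode("utf-8")


def folded_key(s: str) -> str:
    # Stand-in for a case- and accent-insensitive collation:
    # decompose, drop combining marks, then case-fold.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# One row per check: (collation stand-in, input values, expected order).
CASES = [
    (binary_key, ["b", "a", "B", "ä"], ["B", "a", "b", "ä"]),
    (binary_key, ["ア", "い", "あ"], ["あ", "い", "ア"]),
    # Equal keys keep their input order because Python's sort is stable.
    (folded_key, ["b", "Ä", "a", "B"], ["Ä", "a", "b", "B"]),
]


def check_collation(key, values, expected) -> bool:
    # A collation passes a case if sorting with it yields the expected order.
    return sorted(values, key=key) == expected


results = [check_collation(key, values, expected) for key, values, expected in CASES]
print(results)
```

Keeping the cases as a plain table means a native speaker can add rows for a language without touching the test logic.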


Cheers,
        -Brian

--
_______________________________________________________
Brian "Krow" Aker, brian at tangent.org
Seattle, Washington
http://krow.net/                     <-- Me
http://tangent.org/                <-- Software
_______________________________________________________
You can't grep a dead tree.




_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
