Hi!
On Sep 11, 2008, at 10:00 AM, Jay Pipes wrote:
> Character set introducers are used when telling MySQL to consider a
> string literal using a specific character set. They are used as
> follows:
> SELECT _latin1'Kaj Arnö';
> where _latin1 is the charset introducer.
> Question 1:
> Can introducers go away now that all charsets are 4-byte UTF8?
Yes, and they need to. We only do UTF-8, and all strings and
comparisons are handled in it. The only thing you have a choice on is
the collation used for indexes.
This is not 100% true in the code right now, but it gets closer with
every push. There are still hardcoded Latin comparisons that I am
having to remove.
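
To make the introducer point concrete, here is a small sketch in plain
Python (not Drizzle or MySQL code). It only shows why a literal's bytes
are ambiguous when more than one character set is in play, and why that
ambiguity goes away if everything is UTF-8. The helper name
is_valid_utf8 is made up for illustration.

```python
# Illustration only: plain Python, not Drizzle or MySQL code.
name = "Kaj Arnö"

latin1_bytes = name.encode("latin-1")  # 'ö' becomes the single byte 0xF6
utf8_bytes = name.encode("utf-8")      # 'ö' becomes the two bytes 0xC3 0xB6

# Reading UTF-8 bytes as Latin-1 silently produces the wrong text. An
# introducer such as _latin1 exists to tell the server which reading
# is intended.
misread = utf8_bytes.decode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

If every string is UTF-8, a check like is_valid_utf8 at the input
boundary is all that is needed, and there is nothing left for an
introducer to say.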
> Question 2:
> Related to Brian's question about CONVERT and CAST, can we get rid of
> charset conversions entirely? For instance, here is a statement from
The bigger question in my mind: can we get rid of the two functions
completely? Both are implemented in the parser and not as actual
functions. To modularize the parser, these types of functions have to
go away (there are very few of them left).
> Who in our community is an expert on multi-byte character sets,
> specifically UTF8, and will be able to write unit tests which verify
> input and output of Drizzle's UTF8 collations?
My hope was to get Toru to write us some tests in Japanese :)
We need tests, though, that are specific to each of the collations we
have right now. Without a test, I do not trust the collation.
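
As a rough idea of the shape such a test could take, here is a
table-driven sketch in Python. It does not call Drizzle at all:
bin_sort_key is a hypothetical stand-in for a binary UTF-8 collation
(compare the raw bytes), and the cases are only examples. A real test
would push the same inputs through the server and would need one table
of expected orderings per collation.

```python
import unittest


def bin_sort_key(text: str) -> bytes:
    """Hypothetical stand-in for a binary UTF-8 collation: raw byte order."""
    return text.encode("utf-8")


class BinaryCollationTest(unittest.TestCase):
    # Each case is (input strings, expected order under the collation).
    CASES = [
        # ASCII: uppercase sorts before lowercase in byte order.
        (["b", "a", "B"], ["B", "a", "b"]),
        # Hiragana: three-byte sequences, ordered by code point.
        (["い", "あ", "う"], ["あ", "い", "う"]),
        # Mixed widths: 1-byte, 3-byte, and 4-byte characters.
        (["\U0001F600", "あ", "z"], ["z", "あ", "\U0001F600"]),
    ]

    def test_expected_order(self):
        for given, expected in self.CASES:
            with self.subTest(given=given):
                self.assertEqual(sorted(given, key=bin_sort_key), expected)
```

The byte-order key works for the binary case because UTF-8 byte order
matches code point order. Any language-aware collation would need its
own key function and its own expected orderings.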
Cheers,
-Brian
--
_______________________________________________________
Brian "Krow" Aker, brian at tangent.org
Seattle, Washington
http://krow.net/ <-- Me
http://tangent.org/ <-- Software
_______________________________________________________
You can't grep a dead tree.