So... I wonder how many of you have hit this problem: You think
you're using utf8 everywhere. Turns out your php <-> mysql
connection is using latin1. Everything just about works from php,
so it's hard to notice... you don't really see it until you
correctly add "set names utf8" at the start of your script or work
with the data directly in the mysql client. The data in the database
is incorrect and it doesn't look like there's a way to fix it in
mysql itself. Converting to blob or binary and back to the utf8
character set doesn't work. Perhaps a dump of just the data and then
some type of iconv command would do it, but I haven't had any luck
with that so far.
This article is fairly relevant:
http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html
Basically here's the problem:
Björk is being stored as BjÃ¶rk
Or in hex:
What should be 426AC3B6726B is actually 426AC383C2B6726B
If I do "set names latin1" it looks like Björk but if I do "set names
utf8" it looks like BjÃ¶rk
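
To make the hex above concrete, here is a minimal Python sketch that
reproduces the stored bytes from the correct ones. It assumes the
damage comes from the UTF-8 bytes being read as latin1, one character
per byte, and then encoded to UTF-8 a second time. The function name
double_encode is made up for illustration.

```python
def double_encode(raw: bytes) -> bytes:
    """Simulate UTF-8 bytes being misread as latin1 and re-encoded as UTF-8."""
    # Step 1: every byte is taken to be one latin1 character.
    misread = raw.decode("latin-1")
    # Step 2: those characters are encoded to UTF-8 for storage.
    return misread.encode("utf-8")


correct = bytes.fromhex("426AC3B6726B")   # the bytes that should be stored
stored = double_encode(correct)
print(stored.hex().upper())               # 426AC383C2B6726B
```

The two bytes C3 B6 each grow into a two-byte sequence (C3 83 and
C2 B6), which matches what is in the table.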
The database, columns, client, etc. are all defined as utf8. It's
just that PHP's mysql extension doesn't read the default character
set from my.cnf and requires a "set names utf8" in order for the data
to actually be transmitted as utf8. I know mysqli lets you specify
the character set.
As noted by one of the commenters in the above article, it seems that
any C383C2__ sequence needs to be converted to C3__ (so C383C2B6
becomes C3B6). However, I'm not sure if that's all that needs to be
replaced. I can't exactly test every character. But that could be a
start... I'm thinking a dump of the data, a sed script, and then a
reimport would do the trick. I imagine what happened here is that
each byte of the multi-byte unicode characters was treated as a
separate latin1 character and then encoded as utf8 again.
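
Instead of a byte-pattern replacement, the dump could be repaired by
reversing the two encoding steps. The following is only a sketch in
Python under some assumptions: the data was double-encoded exactly
once, and MySQL's latin1 behaves like Windows cp1252 with the five
undefined bytes passed through unchanged. The helper names
mysql_latin1_bytes and repair are made up for illustration.

```python
def mysql_latin1_bytes(text: str) -> bytes:
    """Map each character back to the single byte latin1 would use for it."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Python's cp1252 leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D
            # undefined; fall back to ISO-8859-1 for those. This raises
            # UnicodeEncodeError if the character fits neither table.
            out += ch.encode("latin-1")
    return bytes(out)


def repair(text: str) -> str:
    """Undo one round of double encoding; return text unchanged on failure."""
    try:
        return mysql_latin1_bytes(text).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as latin1, or the result is not valid UTF-8,
        # so the value was probably not double-encoded.
        return text


stored = bytes.fromhex("426AC383C2B6726B").decode("utf-8")
fixed = repair(stored)
print(fixed.encode("utf-8").hex().upper())   # 426AC3B6726B
```

Plain ASCII passes through untouched, and correctly stored values
normally fail the UTF-8 decode and are returned as they were. A value
that legitimately contains a sequence like the damaged one would be
changed too, so this is best tried on a copy of the dump first.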
I'm using MySQL 5.0.45.
Let me know if you know how to fix this.
Thanks!!
-Rob