So... I wonder how many of you have hit this problem: you think you're using utf8 everywhere, but it turns out your php <-> mysql connection is using latin1. Everything just about works from php, so it's hard to notice. You don't really see it until you correctly add "set names utf8" at the top of your script or work with the data directly in the mysql client. By then the data in the database is already incorrect, and it doesn't look like there's a way to fix it right in mysql. Converting to blob or binary and back to the utf8 character set doesn't work. Perhaps a dump of just the data followed by some kind of iconv command would do it, but I haven't had any luck with that so far.

This article is fairly relevant: http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html

Basically here's the problem:
Björk is being stored as BjÃ¶rk

Or in hex:
What should be 426AC3B6726B is actually 426AC383C2B6726B

If I do "set names latin1" it looks like Björk, but if I do "set names utf8" it looks like BjÃ¶rk
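
For anyone who wants to reproduce the hex values outside MySQL, here is a minimal Python sketch. It assumes the connection treated the UTF-8 bytes sent by PHP as latin1 and that the server then converted them to utf8 for storage. Python's latin-1 codec stands in for MySQL's latin1 here, which is only an approximation, and the variable names are made up for illustration.

```python
# Sketch: reproduce the double encoding, assuming UTF-8 bytes were
# interpreted as latin1 and then converted to utf8 on the way in.
original = "Bj\u00f6rk"  # "Björk"

utf8_bytes = original.encode("utf-8")    # bytes PHP actually sent
misread = utf8_bytes.decode("latin-1")   # a latin1 connection sees 6 characters
stored = misread.encode("utf-8")         # what ends up in the utf8 column

print(utf8_bytes.hex().upper())  # 426AC3B6726B
print(stored.hex().upper())      # 426AC383C2B6726B

# With "set names latin1" the server converts the stored text back to
# latin1, which happens to yield the original UTF-8 bytes again. That
# would explain why everything looks fine from php.
sent_to_latin1_client = stored.decode("utf-8").encode("latin-1")
print(sent_to_latin1_client == utf8_bytes)  # True
```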

The database, columns, client, etc. are all defined as utf8. It's just that the mysql extension doesn't read the default character set from my.cnf and requires a "set names utf8" before the data is actually transmitted as utf8. I know mysqli lets you specify the character set.

As noted by one of the commenters in the above article, it seems that any C383C2__ sequence needs to be converted to C3__ (so C383C2B6 becomes C3B6). However, I'm not sure if that's all that needs to be replaced. I can't exactly test every character. But that could be a start... I'm thinking a dump of the data, a sed script, and then a reimport would do the trick. I imagine what happened here is that each byte of the multi-byte unicode characters was treated as a separate latin1 character and then encoded as utf8 again.
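
Rather than patching individual byte sequences, the whole round trip could be reversed in one go. The following is a rough Python sketch, not a tested migration tool. It assumes the text was double encoded exactly once and that MySQL's latin1 behaves like Windows-1252 for bytes 0x80-0x9F, with plain latin-1 as the fallback. The function names are hypothetical.

```python
def to_latin1_bytes(text: str) -> bytes:
    """Map each character back to the single byte it came from.

    Assumption: MySQL's latin1 is cp1252-like, so cp1252 is tried first
    and plain latin-1 is the fallback for code points cp1252 lacks.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            out += ch.encode("latin-1")  # raises if ch is above U+00FF
    return bytes(out)


def fix_double_encoded(text: str) -> str:
    """Undo one round of UTF-8 -> latin1 -> UTF-8 double encoding.

    Returns the input unchanged when it does not look double encoded.
    """
    try:
        return to_latin1_bytes(text).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


broken = bytes.fromhex("426AC383C2B6726B").decode("utf-8")
fixed = fix_double_encoded(broken)
print(fixed.encode("utf-8").hex().upper())  # 426AC3B6726B
```

Strings that are already correct fail the final UTF-8 decode and come back untouched, so running this over a mixed dump should be reasonably safe. A legitimate value that really contains a sequence like "Ã¶" would be altered, though, so spot-check the results before reimporting.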

I'm using MySQL 5.0.45.

Let me know if you know how to fix this.
Thanks!!
-Rob
_______________________________________________
New York PHP Community MySQL SIG
http://lists.nyphp.org/mailman/listinfo/mysql
