Hello list,

I'm writing a small utility in PHP to archive email messages in MySQL so that I can search them with full-text indexing. To handle the various charsets, I have simply been converting all text to UTF-8 (using mb_convert_encoding()) before storing it in the database.
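
For illustration only, here is a minimal sketch of that normalization step, written in Python rather than PHP. The function name to_utf8 and the latin-1 fallback for unknown charset labels are assumptions made for the example, not part of the actual utility.

```python
def to_utf8(raw: bytes, charset: str) -> bytes:
    """Re-encode raw message bytes from their declared charset to UTF-8.

    Hypothetical helper: roughly what mb_convert_encoding($raw, 'UTF-8', $charset)
    does in PHP. Undecodable bytes are replaced rather than raising an error.
    """
    try:
        text = raw.decode(charset, errors="replace")
    except LookupError:
        # Assumption: fall back to latin-1 when the charset label is unknown.
        # latin-1 maps every byte to a character, so this never fails.
        text = raw.decode("latin-1")
    return text.encode("utf-8")


if __name__ == "__main__":
    koi8_bytes = "Ф".encode("koi8-r")       # one byte in KOI8-R
    print(to_utf8(koi8_bytes, "koi8-r"))    # two bytes once converted to UTF-8
```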

I hadn't even considered the charset issue in the database itself until I was looking through the MySQL online manual for something else and ran across the Unicode chapter.

I did a little experimenting with my current db (MySQL 4.0.14 on Red Hat 9, character_set=latin1). It lets me insert records containing Unicode characters such as Ф (CYRILLIC CAPITAL LETTER EF) into varchar and text fields and select them back out with no problem.
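
A rough way to picture this round trip, assuming the server stores the bytes it receives without converting them: the sketch below uses Python's latin-1 codec as a stand-in for the column charset and does not talk to MySQL at all. The function names latin1_view and recover are made up for the example.

```python
def latin1_view(text: str) -> str:
    """Show how UTF-8 encoded text looks when each byte is read as latin1."""
    return text.encode("utf-8").decode("latin-1")


def recover(stored: str) -> str:
    """Reinterpret the latin1 view as UTF-8 again, recovering the original text."""
    return stored.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    stored = latin1_view("Ф")        # UTF-8 bytes 0xD0 0xA4 read as two latin1 characters
    print(len(stored))               # 2
    print(recover(stored) == "Ф")    # True
```

Under that assumption the bytes survive unchanged, but the column treats the single character Ф as two latin1 characters.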

I have a couple of questions about this behavior.
1) If I continue to do this, is it possible that MySQL could lose or mangle some of the wackier characters?
2) Do non-latin1 characters muck with searching or sorting at all? Hopefully, they are just ignored...
3) Will there be any issues with my tables when upgrading to MySQL 4.1? Ideally, searching and sorting would just pick up the Unicode characters once I set character_set=utf8.

Maybe there are some other issues I'm not aware of as well... any insight would be appreciated.

TIA,
Ben


-- MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql