Thanks for your quick response. I have set the meta tag in the page head to <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.

I am submitting Thai text through a form and it is stored in a MySQL database; the database charset and collation appear to be correct. When the text is retrieved, the Thai text is displayed correctly. The problem is that the Thai text gets converted to HTML entities before it is stored, which means that something is amiss. Even though the CMS is a web application and HTML entities are fine for browser display, I would like to follow the correct procedure so that the text in the database is itself readable in the original language.

The strange part is that, when testing on my local machine, the browser (Firefox 1.5) shows the Encoding as "ISO-8859-1" and the content type as utf-8. I changed the browser's default encoding to UTF-8 under Content -> Advanced -> Options, but this does not appear to have affected anything. It could be that the encoding for the server needs to be set to utf-8, or that I need to send headers to that effect. I am running Apache on Windows (XAMPP) locally and using phpMyAdmin as the database GUI. Strangely enough, the page encoding for the phpMyAdmin pages shows up as UTF-8.

I also found a very good resource on various PHP/MySQL/browser issues:
http://www.phpwact.org/php/i18n/charsets?s=utf8

It gets even stranger when I set the form encoding to enctype="multipart/form-data" (which is the recommended setting for submitting Unicode characters). In this case the Thai text is neither stored correctly nor displayed correctly; it is changed into accented symbols. Yet those same accented characters are stored and displayed consistently, so nothing is lost there.

-----Original Message-----
From: Tex Texin [mailto:[EMAIL PROTECTED]
Sent: Monday, January 30, 2006 5:41 PM
To: 'Naintara'; php-i18n@lists.php.net
Subject: RE: [PHP-I18N] storing unicode in mysql database

When you say the browser charset is set to utf-8, how are you doing that?

If the browser is converting the Thai characters to numeric character references (I assume that is what you mean by html codes, e.g. &#ddddd; where d is a decimal digit), then most likely the page that is accepting user data is not set to the right encoding.

Make sure the HTTP protocol specifies that the encoding is utf-8 and not iso 8859-1, or, if the HTTP protocol is not setting the charset, make sure the web page is setting it to utf-8 (<meta http-equiv=Content-Type content="text/html; charset=UTF-8">).

Tex Texin
Internationalization Architect, Yahoo! Inc.

> -----Original Message-----
> From: Naintara [mailto:[EMAIL PROTECTED]
> Sent: Monday, January 30, 2006 3:32 AM
> To: php-i18n@lists.php.net
> Subject: [PHP-I18N] storing unicode in mysql database
>
> Hi,
>
> I'd like to know what would be the best way to store Unicode
> text in a database. I am using MySQL 4.1. I am trying to
> create a multi-lingual CMS and the browser charset is set to
> utf-8 and the database and tables are set to UTF8 and
> utf8_bin for charset and collation.
>
> While displaying in the browser, Thai text is displayed
> correctly but it is stored as html code in the database. Is
> this correct behaviour or is there a better way? Would I need
> to specify charset for every query, or is it enough to have
> specified it for the mysql connection, results and client
> charset options.
>
> --
> No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.1.375 / Virus Database: 267.14.23/243 - Release
> Date: 27-Jan-06
>
> --
> PHP Unicode & I18N Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
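
The two symptoms described in this thread (Thai text arriving as &#ddddd; references, and Thai text turning into accented symbols) can be reproduced at the byte level. The following is only a rough sketch in Python, chosen for brevity and not taken from the thread; the same reasoning should apply to the PHP side. It assumes the browser behaves as Tex describes, replacing characters that the page's charset cannot represent with numeric character references. The helper name submit_form and the sample word are made up for illustration.

```python
import html


def submit_form(text: str, page_charset: str) -> bytes:
    """Hypothetical helper: encode form text for a page in the given charset.

    Assumption: characters the charset cannot represent are sent as
    numeric character references (&#NNNN;), as described in the thread.
    """
    return text.encode(page_charset, errors="xmlcharrefreplace")


thai = "ไทย"  # sample Thai word, used only for illustration

# Page treated as ISO-8859-1: Thai does not fit, so references are sent
# and that is what ends up stored in the database.
as_latin1 = submit_form(thai, "iso-8859-1")

# Page treated as UTF-8: the real characters are sent as UTF-8 bytes.
as_utf8 = submit_form(thai, "utf-8")

# Rows that already contain references can be turned back into text.
restored = html.unescape(as_latin1.decode("ascii"))

# UTF-8 bytes misread as ISO-8859-1 show up as accented symbols, but
# every byte survives, so reversing the misreading recovers the text.
mojibake = as_utf8.decode("iso-8859-1")
round_trip = mojibake.encode("iso-8859-1").decode("utf-8")

print(as_latin1)
print(as_utf8)
print(restored == thai, round_trip == thai)
```

If this model is right, the references in the database point at the form page being handled as ISO-8859-1 at submission time, while the accented symbols point at UTF-8 bytes being read back under a single-byte charset somewhere between the form, PHP and MySQL. The lossless round trip would also explain why the accented characters are stored and displayed consistently.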