I was asked to explain the "what's that on your screen? .̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̸̨̨̨̨̨̨̨̨̨̨̨̨.̸̸̨̨" Twitter meme that originated with http://twitter.com/dailydylann/statuses/63228871759237120. When I tried to insert the Unicode character combination into my post, however, Habari failed on me and returned a bunch of question marks in place of the UTF-8 I put in.
I tracked the problem down to the MySQL Connection Adapter: it only calls 'SET CHARACTER SET UTF8' (or whatever your MYSQL_CHAR_SET is set to; default is UTF8), but not 'SET NAMES UTF8'. According to a source code comment, "SET CHARACTER SET covers all the values included in SET NAMES, as per http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html". However, that is not true, as outlined (albeit not very clearly) on the very page linked above:

== Quote from the page ==
A `SET NAMES 'x'` statement is equivalent to these three statements:

    SET character_set_client = x;
    SET character_set_results = x;
    SET character_set_connection = x;

[...]

A `SET CHARACTER SET x` statement is equivalent to these three statements:

    SET character_set_client = x;
    SET character_set_results = x;
    SET collation_connection = @@collation_database;

Setting collation_connection also sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is not necessary to set character_set_connection explicitly.
== End quote ==

The important difference is that SET CHARACTER SET sets collation_connection to the collation of the selected database, and then in turn uses that value to set character_set_connection. Thus, the two statements are equivalent ONLY if the collation of the database Habari uses is utf8_*.

If, however, your database collation is, for example, the old MySQL default of latin1_swedish_ci, then:

    SET NAMES UTF8;

will set your character_set_connection to UTF8, whereas

    SET CHARACTER SET UTF8;

will set your character_set_connection to latin1!

I thus amended system/schema/mysql/connection.php with:

    $this->exec('SET NAMES ' . MYSQL_CHAR_SET);

after the "SET CHARACTER SET" line and removed the misleading comment. I hope this fixes the UTF-8 issues once and for all.
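To make the difference concrete, here is a small Python sketch of the two statements as the quoted manual text describes them. It is only a toy model: it never talks to a MySQL server, the function names and the starting session values are my own assumptions, and it tracks only the variables named in the quote. The last helper uses Python's own codecs as a rough analogy for text being squeezed through the connection character set; MySQL's actual conversion may differ in detail.

```python
# Toy model only: no MySQL server is involved. All function names and the
# starting values below are hypothetical, chosen to mirror the quoted text.

def legacy_session():
    """Session on a database created with the old latin1 defaults."""
    return {
        "character_set_client": "latin1",
        "character_set_results": "latin1",
        "character_set_connection": "latin1",
        "collation_connection": "latin1_swedish_ci",
        "character_set_database": "latin1",
        "collation_database": "latin1_swedish_ci",
    }


def set_names(session, charset):
    """Model of SET NAMES x, limited to the three variables in the quote."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_results"] = charset
    updated["character_set_connection"] = charset
    return updated


def set_character_set(session, charset):
    """Model of SET CHARACTER SET x as described in the quote."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_results"] = charset
    # The connection collation comes from the database, and the connection
    # character set follows the database character set, not the argument.
    updated["collation_connection"] = session["collation_database"]
    updated["character_set_connection"] = session["character_set_database"]
    return updated


def through_connection(text, connection_charset):
    """Rough analogy: characters the charset cannot hold turn into '?'."""
    raw = text.encode(connection_charset, errors="replace")
    return raw.decode(connection_charset)


if __name__ == "__main__":
    sample = ".\u0338"  # a dot followed by a combining overlay character
    for label, statement in (("SET NAMES", set_names),
                             ("SET CHARACTER SET", set_character_set)):
        session = statement(legacy_session(), "utf8")
        charset = session["character_set_connection"]
        print(label, "->", charset, "->",
              ascii(through_connection(sample, charset)))
```

In this model only the SET NAMES path leaves character_set_connection at utf8 on a latin1 database, which matches the question marks I saw with the unpatched adapter.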
Regards,
Matt
