Chris Meller wrote:
> The most recent MySQL encoding thread here brought up the question of
> why we use utf8_unicode_ci instead of utf8_general_ci like everyone
> else. It's yet to be "resolved", but I thought I'd throw the question
> back out anyway for my own edification. I checked back and found when
> the original UTF8 changes were merged in, but there was no discussion as
> to which character set to use that I could find.
>
> Can someone with more knowledge of the nuances between MySQL character
> sets tell me what the differences are between utf8_general and
> utf8_unicode? I also recall there having been mention somewhere of
> utf8_bin being preferred to prevent problems with the case-insensitivity
> of _ci types, but I don't know why that would be an issue in other
> languages and not English...

utf8_general_ci is a little faster because it uses a very simple collation algorithm. This simple and fast algorithm is also usually wrong. utf8_unicode_ci uses the standard Unicode collation table.

This post on the MySQL forums sums it up pretty well:
http://forums.mysql.com/read.php?103,187048,188748#msg-188748

-Matt
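
To make the "simple but wrong" point concrete, here is a rough Python sketch of the kind of difference involved. It is a toy model, not MySQL's actual implementation: the function names and both lookup tables are made up for illustration. The one behavior it mimics is the well-known one: utf8_general_ci compares character by character and treats "ß" like "s", while utf8_unicode_ci supports expansions and treats "ß" like "ss".

```python
import unicodedata

# Hypothetical tables for illustration only; MySQL's real collation
# tables are far larger and are not reproduced here.
ONE_TO_ONE = {"ß": "s"}    # general_ci style: one character maps to one character
EXPANSIONS = {"ß": "ss"}   # unicode_ci style: one character may expand to several


def strip_accents(text):
    """Drop combining marks so that 'é' compares like 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def general_like_key(text):
    """Toy comparison key: every input character yields at most one key character."""
    out = []
    for ch in text:
        ch = ONE_TO_ONE.get(ch, ch)
        out.append(strip_accents(ch).upper()[:1])
    return "".join(out)


def unicode_like_key(text):
    """Toy comparison key: a character may expand into several key characters."""
    out = []
    for ch in text:
        ch = EXPANSIONS.get(ch, ch)
        out.append(strip_accents(ch).upper())
    return "".join(out)


if __name__ == "__main__":
    for a, b in [("Maß", "Mas"), ("Maß", "Mass"), ("café", "CAFE")]:
        print(a, b,
              "general-like equal:", general_like_key(a) == general_like_key(b),
              "unicode-like equal:", unicode_like_key(a) == unicode_like_key(b))
```

Under this model the two approaches agree on plain case and accent folding ("café" vs "CAFE") and disagree only on the expansion case, which is where the cheaper per-character comparison gives answers a German speaker would call wrong.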
