On Jun 7, 2010, at 11:44 AM, Warren Young wrote:

> On 6/7/2010 9:57 AM, Ryan Chan wrote:
>> http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
>> 
>> Since MySQL only supports the BMP, aren't 16 bits all that is actually needed?
> 
> I imagine they were thinking they'd extend the support to full Unicode in the 
> future and didn't want you to have to dump and reload your databases when 
> that happened.  The Unicode consortium has stated that Unicode will never 
> require more than 21 bits per character[*], and 24 bits is the next 
> multiple of 8 up from that.
> 
> [*] Why 21?  Because that's the maximum number of bits you can express in 4 
> bytes with UTF-8 encoding.  If Unicode were allowed to use all 2^31 code 
> points as originally envisioned, it would require up to 6 bytes per character 
> in UTF-8 encoding.  This promise makes UTF-8 code easier to write and easier 
> to future-proof without bad performance penalties.
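
For anyone who wants to check the arithmetic behind the 21-bit figure, here 
is a minimal Python sketch. It assumes the original 1-to-6-byte UTF-8 layout, 
and the function name utf8_payload_bits is made up for illustration:

```python
def utf8_payload_bits(n):
    """Payload bits carried by an n-byte UTF-8 sequence (original 1-6 byte layout)."""
    if n == 1:
        return 7  # 0xxxxxxx
    # The lead byte spends n one-bits plus a zero bit on the length marker,
    # leaving 7 - n payload bits; each of the n - 1 continuation bytes
    # (10xxxxxx) carries 6 more.
    return (7 - n) + 6 * (n - 1)

for n in range(1, 7):
    print(n, "byte(s):", utf8_payload_bits(n), "bits")

# The highest Unicode code point, U+10FFFF, needs 21 bits.
print((0x10FFFF).bit_length())
```

Four bytes give 21 bits, which is enough for U+10FFFF, and three bytes give 
16 bits, which is exactly the BMP.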


Supplementary Unicode characters (4-byte) are supported as of MySQL 5.5.3:

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode.html
http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html
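
If you want a rough idea of whether existing data would be affected, the 
sketch below checks strings for characters outside the BMP, which are the 
ones that take 4 bytes in UTF-8 and need the utf8mb4 character set described 
on those pages. It runs on the client side in Python, not in MySQL; the 
function name and sample strings are only illustrative:

```python
def needs_four_byte_utf8(text):
    """Return True if any character lies outside the BMP (above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)

# Illustrative samples: ASCII, Latin-1 accent, CJK (BMP), emoji (supplementary).
samples = ["plain ASCII", "caf\u00e9", "\u65e5\u672c\u8a9e", "\U0001F600"]

for s in samples:
    # Widest UTF-8 encoding among the characters of this string.
    widest = max(len(ch.encode("utf-8")) for ch in s)
    print(ascii(s), "max bytes per char:", widest,
          "needs 4-byte support:", needs_four_byte_utf8(s))
```

Only the last sample contains a supplementary character; the others fit in 
3 bytes or fewer per character.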

-- 
Paul DuBois
Oracle Corporation / MySQL Documentation Team
Madison, Wisconsin, USA
www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql