support for 32-bit Unicode

Tomas Frydrych Mon, 04 Feb 2002 01:47:42 -0800


> What we need to do is support the full 32-bit Unicode
> character set but we shouldn't use UTF-32 to do it
> since we'll waste vast amounts of memory space since
> characters above 16-bit are very very rare.  We need
> to instead switch to UTF-8 internally for everything.
> This is the right answer for several reasons which
> have all been covered in depth on several mailing
> lists
Since the characters have a variable bit-width, UTF-8 processing is
very CPU-intensive for everything but the basic 7-bit ASCII charset. It
is not meant to be used internally by applications; it is meant as
an encoding for communication between applications over 8-bit
channels. Internally we need to use a fixed-width encoding, so if we
want to support 32-bit Unicode, we have to redefine UT_UCSChar
to long.
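
To make the trade-off concrete, here is a minimal sketch of the two
access patterns. It is not AbiWord code: UCS4Char is a hypothetical
stand-in for a 32-bit UT_UCSChar, the function names are made up for
illustration, and the UTF-8 path assumes well-formed input and does
no validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a 32-bit UT_UCSChar (not the real typedef).
typedef std::uint32_t UCS4Char;

// Fixed-width: the n-th character is a single array lookup.
UCS4Char nth_char_fixed(const std::vector<UCS4Char>& text, std::size_t n) {
    return text[n];
}

// Length in bytes of the UTF-8 sequence starting with lead byte b.
// Assumes b really is a valid lead byte.
std::size_t utf8_seq_len(unsigned char b) {
    if (b < 0x80) return 1;            // 0xxxxxxx: 7-bit ASCII
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    return 4;                          // 11110xxx
}

// Variable-width: reaching the n-th character means stepping over every
// sequence before it, then assembling the code point from its bytes.
UCS4Char nth_char_utf8(const std::string& text, std::size_t n) {
    std::size_t i = 0;
    while (n-- > 0)
        i += utf8_seq_len(static_cast<unsigned char>(text[i]));

    const unsigned char lead = static_cast<unsigned char>(text[i]);
    const std::size_t len = utf8_seq_len(lead);

    // Payload bits carried by the lead byte for 1-, 2-, 3- and 4-byte forms.
    static const unsigned char lead_mask[] = {0x7F, 0x1F, 0x0F, 0x07};
    UCS4Char cp = lead & lead_mask[len - 1];

    // Each continuation byte (10xxxxxx) contributes six more bits.
    for (std::size_t k = 1; k < len; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(text[i + k]) & 0x3F);
    return cp;
}
```

For the four characters U+0061, U+00E9, U+20AC and U+1D11E, the UTF-8
form takes 1 + 2 + 3 + 4 = 10 bytes, while an array of 32-bit values
takes 16. In return, nth_char_fixed is one indexed load, whereas
nth_char_utf8 has to branch on every lead byte it passes.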


I agree that having a 32-bit UT_UCSChar would waste a lot of memory, and
I would like to see a case made first for why we need to support 32-bit
Unicode.

Tomas
