Re: UTF-24

David Starner Thu, 03 Apr 2003 12:35:25 -0800

On Thu, Apr 03, 2003 at 09:05:23PM +0200, Pim Blokland wrote:
> Why is there no UTF-24?


Why? UTF-24 will almost invariably be larger than UTF-16, unless you are
talking about a document in Old Italic or Gothic. The math alphanumeric
characters will almost always be combined with enough ASCII to make
UTF-8 a win, and if not, enough BMP characters to make UTF-16 a win.
Modern computers don't deal with 24 bit chunks well; in memory, they'd
take up 32 bits apiece, unless you declared them packed, and then
they'd be a lot slower than UTF-16 or UTF-32. And if you're storing to
disk, you may as well use BOCU or SCSU (you're already going
non-standard), or use standard compression with UTF-8, UTF-16, BOCU or
SCSU. SCSU or BOCU compressed should take up half the space of UTF-24,
if that.
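
A rough way to check the size argument is sketched below. It assumes a
hypothetical UTF-24 that stores every code point in exactly three bytes
(Unicode defines no such encoding form); the function name encoded_sizes
and the sample strings are made up for illustration.

```python
def encoded_sizes(text):
    """Return the byte count of `text` under each encoding form."""
    code_points = len(text)  # a Python str is a sequence of code points
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # -le variant adds no BOM
        "utf-24": 3 * code_points,  # hypothetical: fixed 3 bytes per code point
        "utf-32": len(text.encode("utf-32-le")),  # -le variant adds no BOM
    }


samples = {
    "ascii": "Why is there no UTF-24?",
    "bmp": "\u0416\u0438\u0437\u043d\u044c",  # Cyrillic, all inside the BMP
    "gothic": "\U00010330\U00010331\U00010332",  # Gothic, outside the BMP
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

Only the all-Gothic sample comes out smaller in the three-byte form; as
soon as ASCII or other BMP characters are mixed in, UTF-8 or UTF-16 should
pull ahead again.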

-- 
David Starner - [EMAIL PROTECTED]
It's the terror of knowing/What this world is about
Watching some good friends/Screaming 'Let me out'
   -- Queen, "Under Pressure"
