1. Java being designed 10 years ago has nothing to do with the fact that
Unicode is represented as hex internally.
2. Some characters in DBCS charsets cannot be rendered via UTF-8. Period.

Look it up if you don't believe me:
http://en.wikipedia.org/wiki/UTF-8


--Phillip 

 

-----Original Message-----
From: Sixten Otto [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 17, 2006 3:18 PM
To: CF-Talk
Subject: Re: Convert UTF-8 to ISO-8859-1 under CF5

Phillip Holmes wrote:
>If you're using a character set that uses 7 bits for US-ASCII and the
>8th for special characters, that's fine. But if you're using a charset
>that utilizes the second byte, UTF-8 will not be a sufficient encoding
>and you'll have garbling on some characters.

With all due respect, I think you're very much mistaken. UTF-8 (and the
other UTF encodings) were *designed* to encode Unicode. That's what the U
stands for. It wouldn't be much use to anyone if it could only handle 8-bit
character sets (which is *not* what the 8 in the name means).

See, for example:
http://en.wikipedia.org/wiki/Utf-8
http://tools.ietf.org/html/rfc3629

You're correct that Java uses exactly 16 bits to store Unicode chars, but
that's partly because Java was designed (10 years ago) before the size of
Unicode exceeded two bytes' worth of code points.
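
To make the 16-bit point concrete, here is a small sketch in Python (chosen only for brevity; it is not Java or CF5 code, and the sample character is my own pick). It takes one code point above U+FFFF and shows that it needs two 16-bit units in UTF-16 but is still a perfectly ordinary four-byte sequence in UTF-8.

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so it cannot fit
# in a single 16-bit unit. The character is an arbitrary example.
clef = "\U0001D11E"

utf16 = clef.encode("utf-16-be")  # big-endian, no byte-order mark
utf8 = clef.encode("utf-8")

# Split the UTF-16 bytes into 16-bit code units.
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

print(len(clef))                # number of code points in the string
print([hex(u) for u in units])  # the 16-bit units (a surrogate pair)
print(utf8.hex())               # the UTF-8 byte sequence
```

A 16-bit char type has to represent such a character as a surrogate pair, which is a property of the 16-bit storage, not a limit of UTF-8.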

You're also correct that if you took a bunch of bytes that were really
Shift-JIS or some other DBCS, and tried to use them as though they were a
sequence of UTF-8 bytes, bad things would happen. But that has nothing to do
with what UTF-8 can or can't do.
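
A rough illustration of that distinction, again as a Python sketch with a sample string of my own choosing: reading Shift-JIS bytes as though they were UTF-8 fails, while decoding them with the correct charset and then re-encoding as UTF-8 loses nothing.

```python
# Sample text: the three characters of "nihongo" (U+65E5 U+672C U+8A9E).
text = "\u65e5\u672c\u8a9e"

sjis_bytes = text.encode("shift_jis")  # two bytes per character here
utf8_bytes = text.encode("utf-8")      # three bytes per character here

# Mislabeling: treat the Shift-JIS bytes as if they were UTF-8.
try:
    sjis_bytes.decode("utf-8")
    misread_ok = True
except UnicodeDecodeError:
    misread_ok = False

# Proper transcoding: decode with the real charset, then encode as UTF-8.
transcoded = sjis_bytes.decode("shift_jis").encode("utf-8")

print(misread_ok)                          # whether the mislabeled decode worked
print(transcoded == utf8_bytes)            # whether transcoding matches direct UTF-8
print(transcoded.decode("utf-8") == text)  # whether the round trip is lossless
```

The failure comes from the wrong label on the bytes, not from any character that UTF-8 cannot represent.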

I'll note, further, that in your hypothetical example of a full 8-bit
character set (Latin-1, say, or MacRoman), if you were to try to encode
that with UTF-8, many of the encoded characters would require two bytes,
since the most significant bit is actually used by UTF-8 itself. ASCII is a
7-bit character set, which is how those characters can be identically
encoded in UTF-8.
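
The byte counts are easy to check. The following is a minimal Python sketch (the helper name `byte_forms` and the sample characters are mine, purely for illustration) comparing how the same characters come out in Latin-1 and in UTF-8.

```python
def byte_forms(ch):
    """Return (Latin-1 bytes, UTF-8 bytes) for a single character."""
    return ch.encode("latin-1"), ch.encode("utf-8")

# "A" is 7-bit ASCII; U+00E9 (e with acute) and U+00FF (y with diaeresis)
# use the high bit in Latin-1.
samples = ["A", "\u00e9", "\u00ff"]

for ch in samples:
    latin1, utf8 = byte_forms(ch)
    print(f"U+{ord(ch):04X}  latin-1={latin1.hex()}  utf-8={utf8.hex()}")
```

The ASCII letter is the same single byte in both encodings, while each of the two high-bit Latin-1 characters becomes a two-byte sequence in UTF-8.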

Sixten



Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:260962