On Wednesday 05 September 2007 16:55, Jeff Rogers wrote:
> I haven't dealt with Chinese characters at all, but this sounds like
> you're doing character set translations, not character encoding
> conversions.  Tcl's 'encoding' command won't help you here - you'd need
> a monster "string map" command to change all 6000? code points from one
> into the other.  To draw a much simplified analogy, this is like
> translating cp1252 to iso8859-1 - you can't do it by simply changing the
> encoding, you must translate the character set from one to the other by
> mapping the characters that do not appear in the target character set
> (in the case of cp1252->iso8859-1 you might map both the left and right
> single quotes to an apostrophe)

This is what I was thinking. Simplifying a character set isn't 'simple', and
it would seem impossible to go from the simpler character set back to the
more complex one.  It isn't quite a translation, which would be impossible,
but the map will likely need one entry for every character in the larger set,
whereas converting UTF-16 to UTF-8 can be done with an algorithm.  The key is
the map. If it were built into Tcl, or you could load it into Tcl, you could
dispense with IPC, Java and files.
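
To make the distinction concrete, here is a minimal sketch in Python (not
Tcl, and not anything taken from AOLserver). It uses Jeff's cp1252 ->
iso8859-1 analogy rather than Chinese. The two-entry table and the names
CP1252_TO_LATIN1 and to_latin1 are made up for illustration; a real
simplified/traditional table would need thousands of entries.

```python
# Encoding conversion: the same characters, different bytes.
# This is purely algorithmic and needs no lookup table.
text = "\u2018quoted\u2019"  # a word wrapped in left/right single quotes
utf16_bytes = text.encode("utf-16-le")
utf8_bytes = utf16_bytes.decode("utf-16-le").encode("utf-8")

# Character set translation: the target set lacks some characters, so a
# table has to say what each one becomes. Hypothetical two-entry table.
CP1252_TO_LATIN1 = str.maketrans({"\u2018": "'", "\u2019": "'"})


def to_latin1(s: str) -> bytes:
    """Map characters missing from ISO 8859-1, then encode the result."""
    return s.translate(CP1252_TO_LATIN1).encode("iso-8859-1")
```

Without the table, text.encode("iso-8859-1") raises UnicodeEncodeError,
because the curly quotes have no code point in the target set. The sketch
also shows why the reverse direction is the hard one: once both quotes have
become an apostrophe, nothing in the output says which one it used to be.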

tom jackson


--
AOLserver - http://www.aolserver.com/
