On Wednesday 05 September 2007 16:55, Jeff Rogers wrote:
> I haven't dealt with chinese characters at all, but this sounds like
> you're doing character set translations, not character encoding
> conversions. tcl's 'encoding' command won't help you here - you'd need
> a monster "string map" command to change all 6000? code points from one
> into the other. To draw a much simplified analogy, this is like
> translating cp1252 to iso8859-1 - you can't do it by simply changing the
> encoding, you must translate the character set from one to the other by
> mapping the characters that do not appear in the target character set
> (in the case of cp1252->iso8859-1 you might map both the left and right
> single quotes to an apostrophe)
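
To make the cp1252 -> iso8859-1 analogy concrete, here is a minimal sketch. It is in Python purely for illustration; in Tcl the same table would be the argument to "string map". The table CP1252_TO_LATIN1 and the helper to_latin1 are hypothetical names, and the four entries are only a sample, not a complete mapping.

```python
# Hypothetical, deliberately tiny mapping table: cp1252 characters that have
# no ISO 8859-1 equivalent, each mapped to a nearby ISO 8859-1 character.
CP1252_TO_LATIN1 = {
    "\u2018": "'",   # left single quotation mark  -> apostrophe
    "\u2019": "'",   # right single quotation mark -> apostrophe
    "\u201c": '"',   # left double quotation mark  -> straight double quote
    "\u201d": '"',   # right double quotation mark -> straight double quote
}


def to_latin1(text: str) -> bytes:
    """Translate the character set first, then change the encoding."""
    table = str.maketrans(CP1252_TO_LATIN1)
    return text.translate(table).encode("iso8859-1")


raw = b"\x91quoted\x92"        # cp1252 bytes: curly single quotes around a word
text = raw.decode("cp1252")    # encoding conversion: bytes -> characters

# Changing the encoding alone fails, because the curly quotes do not exist
# in the target character set.
try:
    text.encode("iso8859-1")
    reencode_ok = True
except UnicodeEncodeError:
    reencode_ok = False

# With the map applied first, the conversion succeeds.
mapped = to_latin1(text)
```

Both curly quotes end up as the same apostrophe, so the original text cannot be recovered from the result; the map only goes one way.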
This is what I was thinking. Simplifying a character set isn't 'simple', and it would seem impossible to go back from the simple character set to the complex one. It isn't quite a translation, which would be impossible, but the map will likely need one entry for every character in the larger set, whereas converting UTF-16 to UTF-8 needs only an algorithm.

The key is the map. If it were built into Tcl, or you could load it into Tcl, you could dispense with IPC, Java and files.

tom jackson
