> That's the crux of the matter right there. What target.getBytes(fromCharset)
> does is ask the original "target" Unicode String (presumably containing
> % escapes) to convert itself to its byte representation in the original
> charset. Then new String(..., toCharset) creates a new Unicode string while
> pretending those very same bytes we just created are in "toCharset", which is
> presumably a different charset. Any Unicode characters that have different
> encodings in those two character sets will end up changing in the second
> string, because the bytes will be written into the byte array using one
> character set, and then interpreted using another character set. And since
> some character set encodings are stateful, it's conceivable that you could
> even have "fromCharset" and "toCharset" values that caused the new String
> construction to blow up because the byte array was invalid for the
> toCharset decoder.
>
> The part I'm having trouble with is *why* you'd want to do this. The whole
> point of Unicode (or one of them) is so that you don't have to remember
> what charset your byte arrays are in. Once you convert from a String to a
> byte array, you need to preserve the charset of that byte array. Suddenly
> pretending it's in a different charset is just going to screw things up.
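[To make the quoted point concrete, here is a minimal sketch of the encode-then-misdecode pattern being described. The charsets (UTF-8 and ISO-8859-1) and the sample string are chosen purely for illustration; they are not from the code under discussion.]

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {
    public static void main(String[] args) {
        String target = "café";

        // Ask the Unicode String for its byte representation in one charset...
        byte[] bytes = target.getBytes(StandardCharsets.UTF_8);

        // ...then pretend those same bytes are in a different charset.
        // 'é' encodes as the two bytes 0xC3 0xA9 in UTF-8; ISO-8859-1 maps
        // each of those bytes to its own character, so the string changes.
        String mangled = new String(bytes, StandardCharsets.ISO_8859_1);

        System.out.println(mangled); // prints "cafÃ©", not "café"
    }
}
```

Any character whose byte encoding differs between the two charsets is silently corrupted this way, which is exactly why the round trip only looks harmless for pure-ASCII input.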
Laura, I really appreciate your response. I can't say I completely agree with
your point (or understand it), but so be it. Had Sung-Su not refused to
provide a simple unit test case for this method, this discussion would have
been put to an end a few months ago. But apparently writing test cases is
for losers.

Kind regards

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
