Sung-Gu,

You are right. The examples I presented are meaningless. They are meaningless because the URIUtil.toUsingCharset method is meaningless in the first place. I did my best to explain why.
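To restate the argument in code, here is a minimal sketch. It assumes that the method boils down to encoding a String with one charset and decoding the resulting bytes with another; I have not copied the HttpClient source, so the recode helper below is only a hypothetical stand-in, and the class and method names are made up for illustration. The only conversion that preserves the text is the one at the byte[] boundary, with the same charset on both sides.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {

    // Hypothetical stand-in for URIUtil.toUsingCharset (an assumption, not
    // the library code): encode with one charset, decode with another.
    static String recode(String s, Charset from, Charset to) {
        return new String(s.getBytes(from), to);
    }

    // The conversion that does make sense: String -> byte[] -> String,
    // using the SAME charset in both directions.
    static String roundTrip(String s, Charset cs) {
        byte[] wire = s.getBytes(cs);   // what would be stored or transmitted
        return new String(wire, cs);    // what the receiver reconstructs
    }

    public static void main(String[] args) {
        // \u00FC is u-umlaut; the escape keeps this file encoding-independent.
        String city = "Z\u00FCrich";

        // Same charset both ways: the text survives.
        System.out.println(roundTrip(city, StandardCharsets.UTF_8).equals(city)); // true

        // UTF-8 bytes C3 BC read back as ISO-8859-1 become two characters.
        String garbled = recode(city, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("Z\u00C3\u00BCrich")); // true

        // US-ASCII cannot encode the umlaut, so it is replaced by '?'.
        System.out.println(recode(city, StandardCharsets.US_ASCII, StandardCharsets.UTF_8)); // Z?rich
    }
}
```

In every mixed-charset call the result is a different String from the input, which is the garbling I keep running into.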
Again, please give me an example (or better, a unit test) demonstrating a meaningful transformation of one Unicode string into another Unicode string using the method in question.

Oleg

-----Original Message-----
From: Sung-Gu [mailto:[EMAIL PROTECTED]]
Sent: Monday, 27 January 2003 06:01
To: Commons HttpClient Project
Subject: Re: The use of UTIUtil.toUsingCharset?

Hi,

I'm sorry that I didn't get your point... You're interested only in single-byte encodings with Unicode. I hadn't realized that. That's why you haven't seen the correct use and display of that method. I had guessed as much, though. (So I tried to display the byte code values.)

And I'd like to point out that your examples below are not a correct use of the method. They're meaningless. For display (which is what you want, I guess), you should use a code set or charset supported by your operating system, or ISO-8859-1. UTF-8 is meant to be used only as a transformation for storage and transmission. If you want to use Unicode for display, ISO-10646 is fully supported, and the transformation to UTF-8 should be applied from UCS...

I noted it as a TODO comment for simple diagram 2 in the text file. It was not really my previous issue. (As you know, I'm interested in double-byte encodings, and that would be the general way to solve character encoding.) I'll do it sometime later...

Sung-Gu

----- Original Message -----
From: <[EMAIL PROTECTED]>
Subject: Re: The use of UTIUtil.toUsingCharset?

Please take no offense, but the URIUtil.toUsingCharset method still does not make even the slightest sense to me. Your example shows how to invoke this method but does not explain what it is useful for, apart from garbling Unicode strings.

Have a look at a simpler example. Here I attempt to (supposedly) convert "Zürich" from one encoding into another.
However, as you can see, URIUtil.toUsingCharset() always produces garbage:

===================================================================
public static void main(String[] args) throws Exception {
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "UTF-8", "US-ASCII"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "ASCII", "UTF-8"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "UTF-8", "ISO-8859-1"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "ISO-8859-1", "UTF-8"));
}

Output:

Z��rich
Z?rich
Z�ƒ¼rich
Z�
=================================================================

Java uses 16 bits to represent characters. Therefore the concept of character encoding is only applicable when working with arrays of bytes (8-bit units) that represent a sequence of characters. One indeed needs to take character encoding into account when converting from byte[] to String or vice versa. However, converting a Unicode String to an array of bytes and back to a Unicode String using a different encoding (especially in one method call) does not, in my opinion, produce any sensible results.

If you see things differently, please help me understand what URIUtil.toUsingCharset() can be USEFUL for.

Cheers

Oleg

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
