Sung-Gu
Please take no offense, but the URIUtil.toUsingCharset method still does not make even
the slightest sense to me. Your example shows how to invoke this method but does not
explain what it is useful for, apart from garbling Unicode strings.
Have a look at a simpler example. Here I attempt to (supposedly) convert "Zürich" from
one encoding into another. However, as you can see, URIUtil.toUsingCharset() always
produces garbage:
===================================================================
public static void main(String[] args) throws Exception
{
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "UTF-8", "US-ASCII"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "ASCII", "UTF-8"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "UTF-8", "ISO-8859-1"));
    System.out.println(
        URIUtil.toUsingCharset("Zürich", "ISO-8859-1", "UTF-8"));
}
Output:
Z��rich
Z?rich
Z�ƒ¼rich
Z�
=================================================================
Java uses 16 bits to represent characters. Therefore the concept of character encoding
is only applicable when working with arrays of bytes, 8-bit units, that represent a
sequence of characters. One indeed needs to take character encoding into account when
converting from byte[] to String or vice versa. However, converting a Unicode
String to an array of bytes and back to a Unicode String using a different encoding
(especially in one method call) does not, in my opinion, produce any sensible results.
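To make the point concrete, here is a minimal sketch using only the JDK. The class name EncodingBoundaryDemo and the helper roundTrip are hypothetical names made up for this illustration; they are not part of HttpClient, and I am only assuming that a "convert from one charset to another" call on a String boils down to something like this. It relies on java.nio.charset.StandardCharsets, so it assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingBoundaryDemo {

    // Hypothetical helper: encode a String to bytes with one charset,
    // then decode those same bytes with another charset.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "Zurich" with u-umlaut, written as an escape so the source file
        // encoding does not matter.
        String city = "Z\u00fcrich";

        // The same String yields different byte sequences per encoding.
        System.out.println(city.getBytes(StandardCharsets.UTF_8).length);      // prints 7
        System.out.println(city.getBytes(StandardCharsets.ISO_8859_1).length); // prints 6

        // Encoding and decoding with the same charset is lossless here.
        String same = roundTrip(city, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(city)); // prints true

        // Decoding UTF-8 bytes as ISO-8859-1 turns the two bytes of the
        // u-umlaut into two separate characters (U+00C3 and U+00BC).
        String mixed = roundTrip(city, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mixed.equals(city));                 // prints false
        System.out.println(mixed.equals("Z\u00c3\u00bcrich")); // prints true
    }
}
```

The charset only matters at the two boundaries, String to byte[] and byte[] to String. Using mismatched charsets at those boundaries within a single call cannot give back the original text unless the two encodings happen to agree on every character involved.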
If you see things differently, please help me understand what URIUtil.toUsingCharset()
can be USEFUL for.
Cheers
Oleg