Sung-Gu

Please take no offense, but the URIUtil.toUsingCharset method still does not make even 
the slightest sense to me. Your example shows how to invoke this method but does not 
explain what it is useful for, apart from garbling Unicode strings.

Have a look at a simpler example. Here I attempt to (supposedly) convert "Zürich" from 
one encoding into another. However, as you can see, URIUtil.toUsingCharset() always 
produces garbage.

===================================================================
public static void main(String[] args) throws Exception
{
  System.out.println(
    URIUtil.toUsingCharset("Zürich", "UTF-8", "US-ASCII"));
  System.out.println(
    URIUtil.toUsingCharset("Zürich", "ASCII", "UTF-8"));
  System.out.println(
    URIUtil.toUsingCharset("Zürich", "UTF-8", "ISO-8859-1"));
  System.out.println(
    URIUtil.toUsingCharset("Zürich", "ISO-8859-1", "UTF-8"));
}


Output:

Z��rich
Z?rich
Z�ƒ¼rich
Z�

=================================================================

Java uses 16 bits to represent each character. Therefore the concept of character 
encoding is only applicable when working with arrays of bytes (8-bit units) that 
represent a sequence of characters. One indeed needs to take character encoding into 
account when converting from byte[] to String or vice versa. However, converting a 
Unicode String to an array of bytes using one encoding and then back to a Unicode 
String using a different encoding (especially in one method call) does not, in my 
opinion, produce any sensible results.
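
To illustrate the point, here is a minimal sketch that uses only the standard String 
API. The class name CharsetRoundTrip and the roundTrip helper are hypothetical names 
made up for this example. They are not part of HttpClient, and I am not claiming this 
is how URIUtil.toUsingCharset() is implemented. It only shows that encoding matters at 
the String/byte[] boundary, and that encoding with one charset and decoding with 
another changes the text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical helper for illustration only: encode the text with one
    // charset, then decode the resulting bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "Zürich", written with a Unicode escape so the source file's own
        // encoding cannot interfere with the example.
        String city = "Z\u00fcrich";

        // The same String yields different byte sequences per charset.
        byte[] utf8 = city.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = city.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 bytes: " + utf8.length);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);

        // Decoding with the charset used for encoding restores the text.
        String same = roundTrip(city, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("Matching charsets preserve text: " + city.equals(same));

        // Decoding with a different charset does not: the two UTF-8 bytes of
        // the u-umlaut are read as two separate ISO-8859-1 characters.
        String garbled = roundTrip(city, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("Mismatched charsets preserve text: " + city.equals(garbled));

        // Encoding to US-ASCII is lossy: the unmappable character is
        // replaced before any decoding takes place.
        String lossy = roundTrip(city, StandardCharsets.US_ASCII, StandardCharsets.UTF_8);
        System.out.println("After US-ASCII encoding: " + lossy);
    }
}
```

The only step where a charset has a well-defined meaning is the getBytes() call or the 
String constructor taken on its own. Chaining the two with different charsets, as the 
helper does, alters the characters rather than converting them.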

If you see things differently, please help me understand what URIUtil.toUsingCharset() 
can be USEFUL for.

Cheers

Oleg
