Luc Maisonobe wrote:
> Nikola Petrović wrote:
>> Hello all,
>> I have a problem with the Base64 codec when encoding ASCII characters.
>> Here's the code:
>> int ok = 0, bad = 0;
>> for (int i = 128; i < 256; i++) {
>>     char c = (char) i;
>>     String str = "" + c;
>>     // getBytes() and new String(byte[]) both use the platform default charset
>>     String enc = new String(Base64.encodeBase64(str.getBytes()));
>>     String dec = new String(Base64.decodeBase64(enc.getBytes()));
>>     if (str.equals(dec)) ok++;
>>     else bad++;
>> }
>> System.out.println("ok: " + ok);
>> System.out.println("bad: " + bad);
>>
>> I get this:
>> ok: 80
>> bad: 48
>>
>> So, simple encoding and decoding of ASCII chars > 128 doesn't work for me.
>> Is there some explanation?
> 
> Both String.getBytes() and the String constructors that take bytes rely
> on some charset to convert between characters and bytes. I guess your JVM
> configuration has some inconsistencies in the default charset. On my
> Linux box with the default charset set to UTF-8, your code works well.
> 
> For robustness, I suggest you explicitly set the charset as follows:
> 
>   String enc = new String(Base64.encodeBase64(str.getBytes("UTF-8")),
>                           "US-ASCII");
>   String dec = new String(Base64.decodeBase64(enc.getBytes("US-ASCII")),
>                           "UTF-8");
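
For reference, here is a minimal, self-contained sketch of the same
round trip with every charset named explicitly. It is an illustration,
not the code from the original post: it assumes Java 8 or later and
uses the JDK's java.util.Base64 in place of the Commons Codec Base64
class, and the class and method names (Base64RoundTrip, encode, decode,
countFailures) are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64RoundTrip {
    // Text -> UTF-8 bytes -> Base64 bytes -> ASCII string.
    static String encode(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        return new String(Base64.getEncoder().encode(raw), StandardCharsets.US_ASCII);
    }

    // ASCII string -> Base64 bytes -> UTF-8 bytes -> text.
    static String decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64.getBytes(StandardCharsets.US_ASCII));
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Count the characters 128..255 that do not survive the round trip.
    static int countFailures() {
        int bad = 0;
        for (int i = 128; i < 256; i++) {
            String str = String.valueOf((char) i);
            if (!str.equals(decode(encode(str)))) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("bad: " + countFailures());
    }
}
```

Because no call falls back to the platform default charset, the result
should not change from one machine to another.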

You should also have a look at the str = "" + c statement. I guess it
also depends on the default charset. UTF-8 uses sequences of at least
two bytes for special characters, whereas a single byte from 128 to 255
is valid only in some encodings (like ISO-8859-x), with an
encoding-specific meaning. For example, in ISO-8859-1 the character 190
(0xBE) is the 3/4 character (&frac34; in HTML), but in ISO-8859-15 it is
the capital Y with diaeresis (&Yuml; in HTML).
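
The byte 190 example can be checked with a small sketch like the one
below. It assumes Java 7 or later (for StandardCharsets) and a JVM that
ships the ISO-8859-15 charset, which is not one of the charsets every
JVM is required to support. The class and method names are
hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Byte190 {
    // Decode the single byte 0xBE with the given charset and
    // return the resulting Unicode code point.
    static int decodeByte190(Charset cs) {
        byte[] one = { (byte) 0xBE };
        return new String(one, cs).codePointAt(0);
    }

    // Number of bytes needed to encode U+00BE (the 3/4 character) in the given charset.
    static int encodedLength(Charset cs) {
        return "\u00BE".getBytes(cs).length;
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset latin9 = Charset.forName("ISO-8859-15"); // assumed to be available
        System.out.printf("ISO-8859-1:  U+%04X%n", decodeByte190(latin1));
        System.out.printf("ISO-8859-15: U+%04X%n", decodeByte190(latin9));
        System.out.println("UTF-8 bytes for U+00BE: " + encodedLength(StandardCharsets.UTF_8));
    }
}
```

The same byte decodes to U+00BE under ISO-8859-1 and to U+0178 under
ISO-8859-15, while UTF-8 needs two bytes (0xC2 0xBE) for U+00BE.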

Beware that building a String from invalid byte sequences leads to
undefined results.
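
One way to avoid relying on that behaviour is to validate the bytes
before building the String. The sketch below is only one possible
approach, assuming Java 7 or later: it uses a CharsetDecoder configured
to report malformed input instead of silently substituting characters.
The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {
    // Return true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 0xC2 0xBE is the UTF-8 encoding of U+00BE.
        System.out.println(isValidUtf8(new byte[] { (byte) 0xC2, (byte) 0xBE }));
        // A lone 0xBE is a continuation byte with no lead byte.
        System.out.println(isValidUtf8(new byte[] { (byte) 0xBE }));
    }
}
```

The first call prints true and the second prints false, so a single
byte such as 190 can be rejected up front rather than decoded into
something unexpected.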

Luc

> 
> Luc
> 
> 
>> I tried this on OS X with Java 5 and Java 6 and it doesn't work, but
>> with Java 6 on a Solaris PC it does. That confuses me totally :)
>>
>> The other part of the problem is probably linked to this one. I'm trying to
>> encode a string in my Java code and decode it in some PHP code that I use.
>> Is that possible? I guess since Base64 is a standard it should work, and yet
>> it doesn't on OS X, nor on Solaris :(   Maybe it is linked somehow to the
>> ASCII char problem?
>>
>> Thanks,
>> Nikola
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
> 
> 
> 
> 


