I've investigated the problem further and found that it is also reproducible
on the IBM VM, provided the "-Dfile.encoding=ISO-8859-1" option is added.
In fact, the lineSeparator field in java.io.PrintStream gets changed, as the
following test shows:

import java.nio.charset.Charset;
import java.lang.reflect.Field;
import java.io.PrintStream;

public class Test {
   public static void main(String[] args) throws Exception {
       // lineSeparator is a non-public field of PrintStream, so use reflection
       Field f = PrintStream.class.getDeclaredField("lineSeparator");
       f.setAccessible(true);
       System.out.println("separator[0] before encoding: "
               + ((String) f.get(System.out)).getBytes()[0]);
       // encode a character that cannot be mapped to ISO-8859-1
       Charset charset = Charset.forName("ISO-8859-1");
       charset.encode("\u3400");
       System.out.println("separator[0] after encoding: "
               + ((String) f.get(System.out)).getBytes()[0]);
   }
}

Output on J9:
separator[0] before encoding: 13
separator[0] after encoding: 26⌂

Output on DRLVM:
separator[0] before encoding: 13
separator[0] after encoding: 26→
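For VMs where PrintStream has no lineSeparator field, or where reflection on
it is not allowed, the same check can be sketched without reflection by
capturing the raw bytes that println() writes. This is only a sketch: the
class name SeparatorProbe and the helper printlnBytes() are made up for
illustration, and it assumes the changed separator is also visible through a
freshly created PrintStream, which I have not verified.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.Arrays;

public class SeparatorProbe {
    // Hypothetical helper: returns the raw bytes that one empty println()
    // writes to a fresh PrintStream using the default encoding.
    static byte[] printlnBytes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(sink, true);
        ps.println();
        ps.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] before = printlnBytes();
        // same call as in the test above: an unmappable character
        Charset.forName("ISO-8859-1").encode("\u3400");
        byte[] after = printlnBytes();

        System.out.println("println bytes before encoding: " + Arrays.toString(before));
        System.out.println("println bytes after encoding:  " + Arrays.toString(after));
        System.out.println("unchanged: " + Arrays.equals(before, after));
    }
}
```

If the two arrays differ, the separator was changed by the encode() call; if
they are equal, the change is either absent or limited to the System.out
instance.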

The fact that '\u3400' is encoded differently on the RI and on Harmony is
described in http://issues.apache.org/jira/browse/HARMONY-3307, but the
change to lineSeparator is a new and separate problem.
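To compare the encoding itself between VMs, a small sketch like the one
below prints the bytes produced for '\u3400'. The class name EncodeProbe and
the helper encode() are made up for illustration. Charset.encode() replaces
unmappable characters with the charset's replacement bytes, so on the RI I
would expect a single '?' (63) here; what other VMs print is exactly what
needs checking.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;

public class EncodeProbe {
    // Hypothetical helper: encodes text with the named charset and copies
    // the resulting ByteBuffer contents into a plain byte array.
    static byte[] encode(String charsetName, String text) {
        ByteBuffer buf = Charset.forName(charsetName).encode(text);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        // U+3400 has no mapping in ISO-8859-1, so the replacement is used
        byte[] bytes = encode("ISO-8859-1", "\u3400");
        System.out.println("encoded bytes: " + Arrays.toString(bytes));
    }
}
```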

Thanks,
Mikhail

On 5/15/07, Mikhail Markov <[EMAIL PROTECTED]> wrote:

Hi!

While investigating H-3307 I've found a strange effect on DRLVM. The
following code:
import java.nio.charset.Charset;

public class Test {
    public static void main(String[] args) {
        System.out.println("print something...");
        Charset charset = Charset.forName("ISO-8859-1");
        charset.encode("\u3400");
        System.out.println("print something again...");
        System.out.println("and again...");
    }
}

prints extra symbols at the end of each println() message that follows the
charset.encode() line:
print something...
print something again...→
and again...→

If I remove the charset.encode() line, the output is OK:
print something...
print something again...
and again...

Another strange thing is that if I remove the first println line in the code
above, the last two println calls work fine, i.e. without any extra symbols.

This effect is only reproducible on DRLVM. I don't quite understand what
is happening here.

Any thoughts?

Thanks,
Mikhail

