Hi,

I'm experimenting with Tapestry 5.4 (latest alpha). As soon as I have non-ASCII UTF-8 characters in my .tml file, I get an exception about invalid UTF-8 bytes. This happens at least on Windows. I assume it's a bug in XMLTokenStream (https://git-wip-us.apache.org/repos/asf?p=tapestry-5.git;a=blob_plain;f=tapestry-core/src/main/java/org/apache/tapestry5/internal/services/XMLTokenStream.java;hb=HEAD), in openStream.

In Tapestry 5.3, both the reader and the writer use the system encoding, so the encoding stays intact.

In 5.4 the reader is opened with "utf-8" while the writer still uses the system encoding, which can corrupt the resulting bytes whenever the system encoding is not UTF-8. This was discussed a while ago in http://mail-archives.apache.org/mod_mbox/tapestry-dev/201201.mbox/%3ccafaqxjuw4agkr-9cbtpb4qvvgchvevhiwp0vdw7n6ewzu6k...@mail.gmail.com%3E but never changed.

It would be great if this could be fixed. I could not test an in-place fix (using service overrides or similar) because the class is instantiated directly rather than injected. Basically, the fix would be to change the line

   PrintWriter writer = new PrintWriter(bos);

to

   PrintWriter writer = new PrintWriter(new OutputStreamWriter(bos, "UTF8"));

(This is not 100% equivalent, because the original constructor wraps the OutputStreamWriter in a BufferedWriter, but that should not matter when writing to a ByteArrayOutputStream.)
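
For illustration, here is a small standalone sketch (not Tapestry code) of the mismatch and of the proposed change. The class and method names are made up for this example, and ISO-8859-1 is only a stand-in for a non-UTF-8 system encoding such as the Windows default. Bytes written with that encoding are rejected by a strict UTF-8 decoder, while bytes written through an explicit UTF-8 OutputStreamWriter are accepted:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Tapestry.
public class EncodingMismatchDemo {

    // Writes text into a byte array through a PrintWriter that uses
    // the given charset (the proposed fix passes UTF-8 here).
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(new OutputStreamWriter(bos, charset));
        writer.print(text);
        writer.flush(); // push buffered characters into bos before reading it
        return bos.toByteArray();
    }

    // Returns true if the bytes decode as UTF-8 without any malformed input.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "<p>\u00e4\u00f6\u00fc</p>"; // a-umlaut, o-umlaut, u-umlaut

        // Stand-in for a non-UTF-8 system encoding: a UTF-8 reader rejects these bytes.
        System.out.println(isValidUtf8(write(text, StandardCharsets.ISO_8859_1)));

        // Writer and reader agree on UTF-8: the bytes are accepted.
        System.out.println(isValidUtf8(write(text, StandardCharsets.UTF_8)));
    }
}
```

On a machine whose system encoding is already UTF-8, the original `new PrintWriter(bos)` would not show the problem, which may be why it is mostly seen on Windows.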

Thank you,
Michael.
