Character encoding does not follow JTidy properties
---------------------------------------------------

                 Key: COCOON-2297
                 URL: https://issues.apache.org/jira/browse/COCOON-2297
             Project: Cocoon
          Issue Type: Bug
          Components: Blocks: HTML
    Affects Versions: 2.1.11
            Reporter: Nico Verwer


HTMLTransformer always encodes the text it sends to JTidy in the platform 
default encoding, because it calls text.getBytes() without an encoding 
parameter. JTidy does not follow the platform default encoding; it has its own 
default, and its input encoding can be changed in the properties file. When the 
two encodings differ, JTidy misreads the bytes it receives.

The patch uses the encoding specified by JTidy's configuration.
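
To illustrate the mismatch, here is a minimal, self-contained sketch. It is not 
the patch itself and does not call JTidy. The class EncodingSketch and the 
helpers encode() and decode() are hypothetical names chosen for this example, 
and the encoding name stands in for whatever JTidy's configuration specifies.

```java
import java.nio.charset.Charset;

public class EncodingSketch {

    // Hypothetical helper: encode text with an explicitly named charset
    // instead of relying on the platform default used by text.getBytes().
    public static byte[] encode(String text, String encodingName) {
        return text.getBytes(Charset.forName(encodingName));
    }

    // Hypothetical helper: decode bytes the way a consumer configured
    // with the given input encoding would read them.
    public static String decode(byte[] bytes, String encodingName) {
        return new String(bytes, Charset.forName(encodingName));
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, written as an escape so the
        // example does not depend on the source file's own encoding.
        String text = "caf\u00e9";

        byte[] utf8 = encode(text, "UTF-8");

        // Producer and consumer agree on the encoding: the text survives.
        System.out.println(decode(utf8, "UTF-8").equals(text));      // true

        // Producer and consumer disagree: the accented character is garbled.
        System.out.println(decode(utf8, "ISO-8859-1").equals(text)); // false
    }
}
```

The point of the sketch is only that the bytes must be produced with the same 
encoding the consumer expects; the platform default guarantees that only by 
coincidence.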

As a result, HTMLTransformer handles UTF-8 and other encodings correctly, so 
you no longer get Chinese characters where you expected a diacritical mark.

While I was changing the code, I also changed the logging settings. They now 
take the settings in the JTidy configuration into account.

