[ 
https://issues.apache.org/jira/browse/COCOON-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Verwer updated COCOON-2297:
--------------------------------

    Attachment: HTMLTransformer.patch

This patch fixes the issue described.

> Character encoding does not follow JTidy properties
> ---------------------------------------------------
>
>                 Key: COCOON-2297
>                 URL: https://issues.apache.org/jira/browse/COCOON-2297
>             Project: Cocoon
>          Issue Type: Bug
>          Components: Blocks: HTML
>    Affects Versions: 2.1.11
>            Reporter: Nico Verwer
>         Attachments: HTMLTransformer.patch
>
>
> The text that HTMLTransformer sends to JTidy is always encoded according to 
> the platform default encoding, because it calls text.getBytes() without an 
> encoding parameter. JTidy does not follow the platform default encoding, but 
> has its own default. It is possible to change JTidy's input encoding in the 
> properties file.
> The patch uses the encoding specified by JTidy's configuration.
> The result is that HTMLTransformer handles UTF-8 or other encodings 
> correctly, so you don't get Chinese characters where you expected a 
> diacritical mark.
> While I was changing the code, I also changed the logging settings. They now 
> take the settings in the JTidy configuration into account.
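
The mismatch described above can be illustrated with a minimal, standalone sketch. It does not use any Cocoon or JTidy classes and is not the code from the attached patch; the class name EncodingSketch and the helper roundTrip are hypothetical, introduced only to show what happens when the side that encodes the bytes and the side that decodes them assume different charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: encode text with one charset and decode it with
    // another, mimicking a producer and a consumer that disagree on encoding.
    static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer); // explicit charset, not the platform default
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", contains one diacritical mark

        // Producer writes UTF-8, consumer assumes ISO-8859-1: the text is garbled.
        String mismatched = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // Both sides agree on UTF-8: the text survives unchanged.
        String matched = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println(text.equals(mismatched)); // false
        System.out.println(text.equals(matched));    // true
    }
}
```

Calling getBytes() with no argument behaves like the first case whenever the platform default differs from the encoding the consumer expects, which is why passing the consumer's configured encoding explicitly avoids the problem.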

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
