[ 
https://issues.apache.org/jira/browse/CAMEL-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Claus Ibsen updated CAMEL-11846:
--------------------------------
    Priority: Minor  (was: Major)

> xtokenize and apply xslt to a string does not work with UTF-16BE
> ----------------------------------------------------------------
>
>                 Key: CAMEL-11846
>                 URL: https://issues.apache.org/jira/browse/CAMEL-11846
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.17.5
>            Reporter: Robert Half
>            Priority: Minor
>         Attachments: UTF-16BE (with BOM).png, my  example looks like this 
> (and  it's really UTF-16BE).png
>
>
> In XML, the encoding is often declared inside the <?xml ..?> declaration.
> In general you cannot read that declaration without knowing the encoding,
> but XML parsers can detect several encodings, which lets them read it. With
> that information they can read the whole file without being told the
> charset in the first place.
> xtokenize and xslt use XMLInputFactory#createXMLStreamReader(Reader). By
> passing a Reader, Camel tells the parser that the encoding is already known,
> so the parser does not detect it.
> Camel also sets the charset to UTF-8 if none is provided in a header.
> This makes the underlying Reader fail when reading UTF-16.
> Using XMLInputFactory#createXMLStreamReader(InputStream) inside
> XMLTokenExpressionIterator works (tried in a patch). But the next xslt step
> then fails, because it also uses a Reader.
> See the Stack Overflow question for reference:
> [https://stackoverflow.com/questions/46322376/apache-camel-to-handle-encoding-declared-in-xml-file]
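
The difference between the two createXMLStreamReader overloads described
above can be reproduced outside Camel. The following is only a minimal
sketch using the JDK's default StAX implementation, not Camel's actual code:
the class name, the helper method names and the sample document are made up
for illustration, and the sample declares encoding="UTF-16" with a big-endian
BOM rather than reproducing the reporter's file.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from Camel.
class EncodingDetectionSketch {

    // Builds a small XML document encoded as big-endian UTF-16 with a BOM.
    static byte[] utf16beDocument() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<root><item>a</item></root>";
        byte[] body = xml.getBytes(StandardCharsets.UTF_16BE);
        byte[] withBom = new byte[body.length + 2];
        withBom[0] = (byte) 0xFE; // big-endian byte order mark, first byte
        withBom[1] = (byte) 0xFF; // big-endian byte order mark, second byte
        System.arraycopy(body, 0, withBom, 2, body.length);
        return withBom;
    }

    // InputStream overload: the parser inspects the bytes and picks the encoding.
    static String firstElementFromStream(byte[] bytes) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getLocalName();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reader overload with a forced UTF-8 charset, mimicking the default
    // described in the issue: the bytes are decoded before the parser sees them.
    static boolean parsesWithUtf8Reader(byte[] bytes) {
        try {
            Reader utf8 = new InputStreamReader(
                    new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(utf8);
            while (reader.hasNext()) {
                reader.next();
            }
            return true;
        } catch (XMLStreamException e) {
            return false; // wrongly decoded characters are not well-formed XML
        }
    }

    public static void main(String[] args) {
        byte[] doc = utf16beDocument();
        System.out.println("InputStream, first element: " + firstElementFromStream(doc));
        System.out.println("UTF-8 Reader parses: " + parsesWithUtf8Reader(doc));
    }
}
```

With the InputStream overload the parser is expected to find the root
element, while the UTF-8 Reader variant should be rejected as malformed.
Whether the same change is sufficient for the xslt step is not shown here.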



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
