[
https://issues.apache.org/jira/browse/CAMEL-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Claus Ibsen updated CAMEL-11846:
--------------------------------
Priority: Minor (was: Major)
> xtokenize and apply xslt to a string does not work with UTF-16BE
> -----------------------------------------------------------------
>
> Key: CAMEL-11846
> URL: https://issues.apache.org/jira/browse/CAMEL-11846
> Project: Camel
> Issue Type: Bug
> Components: camel-core
> Affects Versions: 2.17.5
> Reporter: Robert Half
> Priority: Minor
> Attachments: UTF-16BE (with BOM).png, my example looks like this
> (and it's really UTF-16BE).png
>
>
> In XML, the encoding is usually declared inside the <?xml ..?> declaration.
> In general you cannot read that declaration without already knowing the
> encoding, but XML parsers can detect several encodings from the first bytes
> of the input, which lets them read the declaration. With that information
> they can decode the whole file without being told the charset in the first
> place.
> xtokenize and xslt use XMLInputFactory#createXMLStreamReader(Reader). By
> passing a Reader, Camel tells the parser that the encoding is already known,
> so the XML parser does not detect it.
> Camel also sets the charset to UTF-8 if none is provided in a header, which
> makes the underlying Reader fail when the input is UTF-16.
> Using XMLInputFactory#createXMLStreamReader(InputStream) inside
> XMLTokenExpressionIterator works (tried in a patch), but the subsequent xslt
> step fails again because it, too, uses a Reader.
> See the Stack Overflow question for reference:
> [https://stackoverflow.com/questions/46322376/apache-camel-to-handle-encoding-declared-in-xml-file]
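
The difference described above can probably be reproduced outside Camel with
plain StAX. The following is only a minimal sketch, not Camel's code: the
class name, the helper methods and the sample document are made up for
illustration, and it assumes the JDK's default StAX implementation. It feeds
the same big-endian UTF-16 bytes (with BOM) to the parser once as an
InputStream and once through a Reader that was forced to UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class EncodingDetectionDemo {

    // Hypothetical sample: a small document written as big-endian UTF-16
    // with a leading byte order mark (0xFE 0xFF).
    static byte[] sampleUtf16Be() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><root>h\u00e9llo</root>";
        byte[] body = xml.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    // Advances to the first start element and returns its text content.
    static String rootText(XMLStreamReader xsr) throws XMLStreamException {
        try {
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamReader.START_ELEMENT) {
                    return xsr.getElementText();
                }
            }
            return null;
        } finally {
            xsr.close();
        }
    }

    // InputStream variant: the parser sees raw bytes and detects the encoding itself.
    static String parseFromStream(byte[] bytes) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        return rootText(factory.createXMLStreamReader(new ByteArrayInputStream(bytes)));
    }

    // Reader variant: the bytes are decoded as UTF-8 before the parser sees them,
    // so no detection can happen. Returns null if the parser rejects the input.
    static String parseFromUtf8Reader(byte[] bytes) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        try {
            return rootText(factory.createXMLStreamReader(reader));
        } catch (XMLStreamException e) {
            return null;
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        byte[] bytes = sampleUtf16Be();
        System.out.println("InputStream: " + parseFromStream(bytes));
        System.out.println("UTF-8 Reader: " + parseFromUtf8Reader(bytes));
    }
}
```

If this matches what Camel does internally, a fix would need to hand the
parser an InputStream (or a Reader built with the right charset) in both the
xtokenize and the xslt step, not just in one of them.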
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)