Kevin Risden created KNOX-2202:
----------------------------------
Summary: Knox should use UTF-8 as default encoding instead of
ISO-8859-1
Key: KNOX-2202
URL: https://issues.apache.org/jira/browse/KNOX-2202
Project: Apache Knox
Issue Type: Bug
Reporter: Kevin Risden
Assignee: Kevin Risden
Fix For: 1.4.0
If you send an XML document containing non-ASCII Unicode characters through Knox, you get the following:
{code:java}
...
Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
 at [row,col {unknown-source}]: [1,0]
    at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:687)
    at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2220)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2126)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1181)
    at org.codehaus.stax2.ri.Stax2EventReaderImpl.nextEvent(Stax2EventReaderImpl.java:255)
    at org.apache.knox.gateway.filter.rewrite.impl.xml.XmlFilterReader.read(XmlFilterReader.java:122)
    ... 133 more
{code}
By default, Knox falls back to ISO-8859-1 encoding instead of UTF-8.
I did some research, and the specified default encoding has changed over the
years. It looks like ISO-8859-1 was the default historically, but the current
default should be UTF-8.
https://stackoverflow.com/questions/58337900/how-to-change-default-character-encoding-configuration-in-jetty-app-server-from
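As a rough illustration of the proposed fallback (not the actual Knox change), the sketch below picks a charset from a Content-Type value and defaults to UTF-8 when none is declared. The class `DefaultCharsetSketch` and the method `resolveCharset` are hypothetical names made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Knox.
class DefaultCharsetSketch {

    // Returns the charset named in a Content-Type value such as
    // "text/xml; charset=ISO-8859-1", or UTF-8 when the header is missing,
    // has no charset parameter, or names an unknown charset.
    static Charset resolveCharset(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String param = part.trim();
                if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = param.substring("charset=".length())
                            .replace("\"", "")
                            .trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the default below.
                    }
                }
            }
        }
        // Assumed default, following the proposal in this issue.
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset("application/xml"));
        System.out.println(resolveCharset("text/xml; charset=ISO-8859-1"));
    }
}
```

An explicitly declared charset still wins here, so only requests that declare nothing would see a behavior change.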
ISO-8859-1 and UTF-8 encode ASCII characters identically, so the two are
incompatible only for characters outside the ASCII range, which should be rare.
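To make the incompatibility concrete, here is a small sketch that encodes text as UTF-8 and then decodes the bytes as ISO-8859-1, which is what a receiver assuming the old default would do. The class `CharsetMismatchDemo` and its method are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only; not part of Knox.
class CharsetMismatchDemo {

    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1.
    static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // ASCII-only input is one byte per character in both encodings,
        // so it survives the mismatch unchanged.
        System.out.println(decodeUtf8BytesAsLatin1("<doc>abc</doc>"));

        // U+00E9 is two bytes in UTF-8 (0xC3 0xA9). ISO-8859-1 maps each
        // byte to its own character, so one character becomes two.
        String garbled = decodeUtf8BytesAsLatin1("caf\u00e9");
        System.out.println(garbled.length());
    }
}
```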
I also found that the default XML encoding is UTF-8, so even if we don't change
all the defaults to UTF-8, we should do so for XML.
https://www.w3schools.com/xml/xml_syntax.asp
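The XML default can be seen with the JDK's built-in StAX parser: given raw bytes with no encoding declaration and no byte order mark, it should decode them as UTF-8. This is only a sketch under that assumption and does not use Knox's XmlFilterReader; the class `XmlUtf8DefaultSketch` and the method `readRootText` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class for illustration only; not part of Knox.
class XmlUtf8DefaultSketch {

    // Parses raw XML bytes and returns the text of the root element.
    // No charset is passed in, so the parser applies the XML rules for
    // detecting the encoding itself.
    static String readRootText(byte[] xmlBytes) {
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xmlBytes));
            try {
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT) {
                        // Valid here because the root holds only text.
                        return xml.getElementText();
                    }
                }
                return null;
            } finally {
                xml.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration, non-ASCII content, encoded as UTF-8.
        byte[] body = "<doc>caf\u00e9</doc>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readRootText(body));
    }
}
```

Wrapping the same bytes in a Reader built with ISO-8859-1 before handing them to the parser bypasses this detection, which is why the charset used for that Reader matters.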