Kevin Risden created KNOX-2202:
----------------------------------

             Summary: Knox should use UTF-8 as default encoding instead of ISO-8859-1
                 Key: KNOX-2202
                 URL: https://issues.apache.org/jira/browse/KNOX-2202
             Project: Apache Knox
          Issue Type: Bug
            Reporter: Kevin Risden
            Assignee: Kevin Risden
             Fix For: 1.4.0


If you send in an XML document containing non-ASCII Unicode characters, you get the following:

{code:java}
...
Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
 at [row,col {unknown-source}]: [1,0]
        at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:687)
        at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2220)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2126)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1181)
        at org.codehaus.stax2.ri.Stax2EventReaderImpl.nextEvent(Stax2EventReaderImpl.java:255)
        at org.apache.knox.gateway.filter.rewrite.impl.xml.XmlFilterReader.read(XmlFilterReader.java:122)
        ... 133 more
{code}

By default, Knox falls back to ISO-8859-1 encoding instead of UTF-8.

I did some research, and the specified default encoding has changed over the 
years. ISO-8859-1 appears to have been the default historically, but the 
current default should be UTF-8.

https://stackoverflow.com/questions/58337900/how-to-change-default-character-encoding-configuration-in-jetty-app-server-from

There are very few cases where ISO-8859-1 and UTF-8 are incompatible, and they 
all involve characters outside the ASCII range, which both encodings represent 
with the same bytes.
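
To make the compatibility point concrete, here is a minimal sketch in plain 
Java. It is not Knox code; the class and method names (EncodingComparison, 
sameBytes, misdecode) are made up for illustration. It compares the bytes both 
charsets produce and shows what happens when UTF-8 bytes are decoded as 
ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of Knox.
public class EncodingComparison {

    // True when the text encodes to identical bytes in ISO-8859-1 and UTF-8.
    static boolean sameBytes(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(latin1, utf8);
    }

    // Encodes as UTF-8, then decodes as ISO-8859-1: the mismatch that a
    // wrong default encoding introduces.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // ASCII-only text: both encodings agree.
        System.out.println(sameBytes("knox"));
        // U+00E9 (e with acute accent) is outside ASCII: the encodings differ.
        System.out.println(sameBytes("caf\u00e9"));
        // The two UTF-8 bytes of U+00E9 come back as two separate characters.
        System.out.println(misdecode("caf\u00e9").length());
    }
}
```

sameBytes returns true for "knox" and false for "caf\u00e9", and the 
misdecoded string is one character longer than the input because the two-byte 
UTF-8 sequence is read as two ISO-8859-1 characters.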

I also found that the default XML encoding is UTF-8, so even if we don't change 
all the defaults to UTF-8, we should do so for XML.

https://www.w3schools.com/xml/xml_syntax.asp
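
As a rough illustration of the XML default, the sketch below uses the standard 
javax.xml.stream (StAX) API from the JDK, not Knox's XmlFilterReader. The class 
and method names (XmlDefaultEncoding, firstElementText) are hypothetical. It 
assumes the parser is handed the raw bytes through an InputStream, so that it 
can detect the encoding itself; a document with no encoding declaration and no 
byte order mark is then read as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class for illustration only; not part of Knox.
public class XmlDefaultEncoding {

    // Parses raw XML bytes and returns the text of the first element.
    // The InputStream overload lets the parser detect the encoding itself.
    static String firstElementText(byte[] xmlBytes) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader =
                factory.createXMLStreamReader(new ByteArrayInputStream(xmlBytes));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration, so no explicit encoding; bytes are UTF-8.
        byte[] xml = "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8);
        // The non-ASCII character survives the round trip.
        System.out.println("caf\u00e9".equals(firstElementText(xml)));
    }
}
```

If the same bytes were first decoded with an ISO-8859-1 Reader and then parsed, 
the non-ASCII character would be corrupted before the parser ever saw it, which 
is why the fallback charset matters for XML bodies.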



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
