[ https://issues.apache.org/jira/browse/XALANJ-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811569#comment-17811569 ]
Cédric Damioli commented on XALANJ-2625:
----------------------------------------

[~kesh...@alum.mit.edu] this one should be resolved as duplicate of XALANJ-2618

> Text output in ISO-8859-1 in Java 11
> -------------------------------------
>
>                 Key: XALANJ-2625
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2625
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone (Ordinary problems in
>                      Xalan projects. Anybody can view the issue.)
>          Components: Xalan
>    Affects Versions: 2.7.2
>            Reporter: Daniel van den Ouden
>            Assignee: Gary D. Gregory
>            Priority: Minor
>
> We're currently in the process of upgrading our builds from Java 8 to Java 11
> and we've run into the following issue:
> Given the following XML
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <Settings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>           xsi:noNamespaceSchemaLocation="../xsd/DBSettings.xsd">
>     <Database>
>         <Type value="Oracle"/>
>         <Database value="UTF8"/>
>         <User name="fgi_user" password="fgi"/>
>         <Owner name="fgi_owner" password="fgi"/>
>     </Database>
> </Settings>
> {noformat}
> and the following XSL
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 xmlns:fo="http://www.w3.org/1999/XSL/Format">
>     <xsl:output method="text" version="1.0" encoding="ISO-8859-1"
>                 indent="yes"/>
>     <xsl:template match="/">
>         <xsl:text>db://</xsl:text>
>         <xsl:value-of select="/Settings/Database/Type/@value">
>         </xsl:value-of>
>         <xsl:text>:</xsl:text>
>         <xsl:value-of select="/Settings/Database/User/@name" />
>         <xsl:text>/</xsl:text>
>         <xsl:value-of
>             select="/Settings/Database/User/@password" />
>         <xsl:text>@</xsl:text>
>         <xsl:value-of
>             select="/Settings/Database/Database/@value" />
>     </xsl:template>
> </xsl:stylesheet>
> {noformat}
> We would expect the output to be
> {noformat}
> db://Oracle:fgi_user/fgi@UTF8
> {noformat}
> But with Java 11, the output becomes
> {noformat}
> db://Oracle:fgi_user/fgi@UTF8
> {noformat}
> And the console gets flooded with messages like
> {noformat}
> Attempt to output character of integral value 100 that is not represented in specified output encoding of ISO-8859-1.
> Attempt to output character of integral value 98 that is not represented in specified output encoding of ISO-8859-1.
> Attempt to output character of integral value 58 that is not represented in specified output encoding of ISO-8859-1.
> Attempt to output character of integral value 47 that is not represented in specified output encoding of ISO-8859-1.
> Attempt to output character of integral value 47 that is not represented in specified output encoding of ISO-8859-1.
> {noformat}
> The problem seems to be caused by org.apache.xml.serializer.Encodings.java.
> In loadEncodingInfo(), a properties file is read
> (org.apache.xml.serializer.Encodings.properties) containing a Java encoding
> name and the associated MIME name that may appear in a stylesheet. For
> ISO-8859-1, it contains the following entries in this order:
> {noformat}
> ISO8859-1 ISO-8859-1 0x00FF
> ISO8859_1 ISO-8859-1 0x00FF
> 8859-1    ISO-8859-1 0x00FF
> 8859_1    ISO-8859-1 0x00FF
> {noformat}
> The loadEncodingInfo() method iterates over these entries, but the order
> differs between Java 8 and Java 11.
> Java 8:
> {noformat}
> ISO8859-1
> 8859_1
> 8859-1
> ISO8859_1
> {noformat}
> Java 11:
> {noformat}
> ISO8859-1
> ISO8859_1
> 8859_1
> 8859-1
> {noformat}
> Every entry is put in the _encodingTableKeyJava map using the Java name as
> key, and in the _encodingTableKeyMime hashtable using the MIME name as key.
> In our case, the method getEncodingInfo(String encoding) is called with
> "encoding" having the value "ISO-8859-1". First the _encodingTableKeyJava map
> is checked; it doesn't contain the key "ISO-8859-1". Then the
> _encodingTableKeyMime map is checked, which contains the last entry that was
> processed from the properties file with a matching MIME name.
> Then the Java name of that entry is used to build a new EncodingInfo object
> and perform the actual encoding using the String class.
> The problem here is that with Java 11, the last entry from the properties
> file is "8859-1". This is NOT an alias for the actual ISO-8859-1 encoding.
> With Java 8, the last entry would be "ISO8859_1", which IS an alias for
> ISO-8859-1.
> The aliases as I found them are:
> {noformat}
> ISO-8859-1
> 819
> ISO8859-1
> l1
> ISO_8859-1:1987
> ISO_8859-1
> 8859_1
> iso-ir-100
> latin1
> cp819
> ISO8859_1
> IBM819
> ISO_8859_1
> IBM-819
> csISOLatin1
> {noformat}
> Long story short: org.apache.xml.serializer.Encodings.properties contains
> entries that are not valid encoding aliases. Removing 8859-1 through 8859-9
> should fix it.
> Changing _encodingTableKeyMime to contain multiple encodings per MIME name
> would be an option as well.
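
The lookup behaviour described in the issue can be reproduced outside Xalan with a small sketch. This is not Xalan code: the class and method names (EncodingAliasCheck, resolves, lastJavaNameFor) are hypothetical, and the MIME-keyed map is only an assumed stand-in for _encodingTableKeyMime, modelled as "a later entry overwrites an earlier one" as the report describes. The only JDK call relied on is java.nio.charset.Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Xalan.
class EncodingAliasCheck {

    /** Returns true if this JVM can resolve the given name to a charset. */
    static boolean resolves(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal charset names.
            return false;
        }
    }

    /**
     * Assumed model of a MIME-keyed table: every Java name is stored under
     * the same MIME key, so the last one processed is the one that remains.
     */
    static String lastJavaNameFor(String mime, List<String> javaNamesInOrder) {
        Map<String, String> byMime = new HashMap<>();
        for (String javaName : javaNamesInOrder) {
            byMime.put(mime, javaName); // overwrites any earlier entry
        }
        return byMime.get(mime);
    }

    public static void main(String[] args) {
        // Iteration orders exactly as quoted in the issue.
        List<String> java8Order =
                Arrays.asList("ISO8859-1", "8859_1", "8859-1", "ISO8859_1");
        List<String> java11Order =
                Arrays.asList("ISO8859-1", "ISO8859_1", "8859_1", "8859-1");

        // Which of the four Java names does this JVM actually know?
        for (String name : java11Order) {
            System.out.println(name + " resolves: " + resolves(name));
        }

        System.out.println("Java 8 order keeps:  "
                + lastJavaNameFor("ISO-8859-1", java8Order));
        System.out.println("Java 11 order keeps: "
                + lastJavaNameFor("ISO-8859-1", java11Order));
    }
}
```

With the orders quoted above, the last-wins table keeps "ISO8859_1" for the Java 8 order and "8859-1" for the Java 11 order. The per-name "resolves" lines depend on the JVM's alias table, so they are worth running on the JDK in question rather than assuming.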