Encodings.properties

Joe Kesselman (Jira) Fri, 26 Jan 2024 09:11:05 -0800


    [ 
https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811344#comment-17811344
 ]


Joe Kesselman commented on XALANJ-2618:
---------------------------------------

Or perhaps: change the properties file so that both the Java and MIME name 
columns can be lists, with the rule that the first entry in each list is the 
preferred match:

ISO8859-1,ISO8859_1,8859-1,8859_1     ISO-8859-1     0x00FF

and build the hashmaps so all the mime keys map to the first java key and all 
the java keys map to the first mime key.

May be overkill.
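
A minimal sketch of that idea, not the actual Encodings implementation. The 
class name, the two plain String maps, and the upper-casing of keys are all 
assumptions made for illustration; the real class stores EncodingInfo objects 
in Hashtables. The sample line uses the four Java names listed in the quoted 
report below.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class EncodingListSketch {
    // Hypothetical lookup tables; names and value types are illustrative only.
    final Map<String, String> javaToMime = new LinkedHashMap<>();
    final Map<String, String> mimeToJava = new LinkedHashMap<>();

    /** Parses one line of the proposed form: javaNames  mimeNames  highChar. */
    void addLine(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 2) {
            throw new IllegalArgumentException("expected at least two columns: " + line);
        }
        List<String> javaNames = Arrays.asList(cols[0].split(","));
        List<String> mimeNames = Arrays.asList(cols[1].split(","));

        // The first entry of each list is the preferred name.
        String preferredJava = javaNames.get(0);
        String preferredMime = mimeNames.get(0);

        // Every Java alias maps to the preferred MIME name, and vice versa.
        for (String j : javaNames) {
            javaToMime.put(normalize(j), preferredMime);
        }
        for (String m : mimeNames) {
            mimeToJava.put(normalize(m), preferredJava);
        }
    }

    // Assumption: lookups are case-insensitive, so keys are upper-cased.
    static String normalize(String name) {
        return name.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        EncodingListSketch s = new EncodingListSketch();
        s.addLine("ISO8859-1,ISO8859_1,8859-1,8859_1     ISO-8859-1     0x00FF");
        System.out.println(s.mimeToJava.get("ISO-8859-1")); // ISO8859-1
        System.out.println(s.javaToMime.get("8859_1"));     // ISO-8859-1
    }
}
```

Because each line names its preferred entries explicitly, the result no longer 
depends on the order in which the properties are enumerated.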

> Error in org/apache/xml/serializer/Encodings.properties
> -------------------------------------------------------
>
>                 Key: XALANJ-2618
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2618
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>          Components: Serialization, transformation
>    Affects Versions: 2.7.2
>         Environment: Java 11
>            Reporter: Simon Schaarschmidt
>            Assignee: Steven J. Hathaway
>            Priority: Major
>              Labels: Java11
>
> We transform and serialize using encoding ISO-8859-1. With JDK 1.8 all is 
> fine, but with OpenJDK 11 the result is written (from class 
> ToTextStream) as character references, e.g. 
> "*&#105;&#100;&#61;&#49;*" instead of "*id=1*".
> In org/apache/xml/serializer/Encodings.properties (serializer.jar) 
> various encodings are defined, e.g.
> {{ISO8859-1  ISO-8859-1  0x00FF}}
> {{ISO8859_1  ISO-8859-1  0x00FF}}
> {{{color:#ff0000}8859-1{color}     ISO-8859-1  0x00FF}}
> {{{color:#ff0000}8859_1{color}     ISO-8859-1  0x00FF}}
> First value: Java encoding name
> Second value: comma separated preferred mime names.
> The class org.apache.xml.serializer.Encodings reads this file into a 
> Properties object, processes the definitions to create EncodingInfo objects, 
> and puts them (see method loadEncodingInfo()) into the member fields 
> _encodingTableKeyJava and _encodingTableKeyMime (both Hashtable). 
> Putting elements into _encodingTableKeyMime is especially critical because 
> there is not a 1:1 mapping, and the element most recently returned by 
> Properties.keys() replaces the previous EncodingInfo object.
> Until Java 1.8 the first line from above is the last entry in the 
> Enumeration, therefore _encodingTableKeyMime returns the EncodingInfo object 
> with Java encoding "{color:#14892c}ISO8859-1{color}" for encoding 
> "ISO-8859-1". With Java 11 the elements of the Enumeration returned by 
> Properties.keys() have a different order: the third line from above is the 
> last entry! Therefore _encodingTableKeyMime returns the EncodingInfo object 
> with Java encoding "*{color:#ff0000}8859-1{color}*" when asked for encoding 
> "ISO-8859-1". But "8859-1" is not a valid Java encoding name! Method 
> EncodingInfo.inEncoding(char,String) fails internally with an 
> *UnsupportedEncodingException* and returns false.
> The methods in class Encodings first search for an EncodingInfo object in 
> _encodingTableKeyJava and use elements from _encodingTableKeyMime as a 
> fallback.
> I suggest extending the definitions in Encodings.properties with 
> additional lines, e.g.
> {{*{color:#14892c}ISO-8859-1{color}* ISO-8859-1  0x00FF}}
> The same applies to encodings ISO-8859-2..9. Alternatively, all entries with 
> Java encoding name "8859*" should be removed. (They are not valid Java 
> encoding names - UnsupportedEncodingException!)
> Finally, I think the current mechanism of collecting the EncodingInfo objects 
> using two Hashtables is problematic.
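
The order dependence described in the quoted report can be reproduced in 
isolation. This is only a sketch: the class and method names are hypothetical, 
plain Strings stand in for EncodingInfo objects, and the visiting order is 
passed in explicitly rather than taken from Properties.keys().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MimeOverwriteSketch {
    /**
     * Builds a MIME-name to Java-name table by visiting {javaName, mimeName}
     * pairs in the given order, the way a loop over enumerated keys would.
     */
    static Map<String, String> buildMimeTable(List<String[]> entries) {
        Map<String, String> mimeToJava = new HashMap<>();
        for (String[] e : entries) {
            // A later entry with the same MIME name silently replaces the earlier one.
            mimeToJava.put(e[1], e[0]);
        }
        return mimeToJava;
    }

    public static void main(String[] args) {
        String[] valid = {"ISO8859-1", "ISO-8859-1"};
        String[] bare = {"8859-1", "ISO-8859-1"};

        // Same two entries, different visiting order, different winner.
        System.out.println(buildMimeTable(Arrays.asList(bare, valid)).get("ISO-8859-1")); // ISO8859-1
        System.out.println(buildMimeTable(Arrays.asList(valid, bare)).get("ISO-8859-1")); // 8859-1
    }
}
```

Whichever entry is visited last wins, so any change in enumeration order 
between JDK versions changes which Java name is returned for "ISO-8859-1".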



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
