Joshua Maurice created XALANJ-2613:
--------------------------------------

             Summary: TransformerIdentityImpl doesn't properly handle file URIs 
with percent-encoded Unicode characters
                 Key: XALANJ-2613
                 URL: https://issues.apache.org/jira/browse/XALANJ-2613
             Project: XalanJ2
          Issue Type: Bug
      Security Level: No security risk; visible to anyone (Ordinary problems in 
Xalan projects.  Anybody can view the issue.)
          Components: transformation
    Affects Versions: 2.7.2
         Environment: I tested on the following system:

$ cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core)
$ uname -a
Linux jjmdeskvm.informatica.com 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 
20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ env | grep -E '^LANG'
LANG=en_US.UTF-8
$ env | grep -E '^LC'
$
            Reporter: Joshua Maurice
            Assignee: Steven J. Hathaway
             Fix For: The Latest Development Code
         Attachments: Repro.java, runtest.sh

When Xalan's javax.xml.transform.Transformer is given a 
javax.xml.transform.stream.StreamResult constructed from a java.io.File object 
whose path contains non-ASCII Unicode characters, the Transformer creates the 
output file at the wrong file path.

I have attached a small repro: a short Java file and a short bash script that 
compiles and runs the test and prints a few relevant environmental details.

 

The cause of the bug is this:

When constructing a StreamResult object by passing a File object to the 
constructor, the StreamResult object saves a string representation of the URI 
object created from the File object. This string representation of the URI is 
properly formatted, which means that the individual path elements of the path 
of the URI are properly percent-encoded. The Xalan TransformerImpl class calls 
getSystemId on StreamResult to get this string representation of the URI, and 
it simply strips off the leading "file://" prefix, and uses the remainder to 
create a FileOutputStream object. However, the remainder of the string is the 
result of URI percent-encoding, and as such, it is not suitable for directly 
passing to FileOutputStream. Instead, the code here must use a URI utility to 
properly interpret the URI string, and to undo the percent-encoding, to obtain 
a string that is suitable for creating a FileOutputStream object.
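
The difference between the two approaches can be sketched as follows. This is 
only an illustration, not the Xalan code: the class name, the method names, and 
the sample system ID "file:///tmp/caf%C3%A9.xml" are made up for the example, 
and java.net.URI with the File(URI) constructor is just one possible "URI 
utility".

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper class for illustration; not part of Xalan.
final class SystemIdPaths {

    // Naive approach: strip the "file://" prefix and keep the rest verbatim.
    // Any percent-encoded bytes stay in the returned string.
    static String naivePath(String systemId) {
        String prefix = "file://";
        return systemId.startsWith(prefix)
                ? systemId.substring(prefix.length())
                : systemId;
    }

    // URI-aware approach: parse the system ID as a URI and let File(URI)
    // undo the percent-encoding of the path.
    static File decodedFile(String systemId) {
        return new File(URI.create(systemId));
    }

    public static void main(String[] args) {
        // Assumed example of what getSystemId() might return for a file
        // whose name contains U+00E9 (e with acute accent).
        String systemId = "file:///tmp/caf%C3%A9.xml";

        // Still contains the literal text "%C3%A9".
        System.out.println(naivePath(systemId));

        // File name has the percent-encoding undone.
        System.out.println(decodedFile(systemId).getName());
    }
}
```

Passing the result of naivePath to FileOutputStream would create a file whose 
name literally contains "%C3%A9", whereas the File returned by decodedFile 
names the file the caller asked for.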

When the file path contains only ASCII characters that need no escaping in a 
URI, percent-encoding changes nothing, so the code works for such paths. 
However, as soon as any non-ASCII Unicode character is part of the file path, 
the code breaks by writing to the wrong file path.
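
To see which paths are affected, one can print the URI string for a few file 
names. The sketch below uses File.toURI().toASCIIString() as a stand-in for 
the string that StreamResult stores; that equivalence is an assumption, and 
the class name, method name, and paths are made up for the example.

```java
import java.io.File;

// Hypothetical helper class for illustration; not part of Xalan or the JDK.
final class SystemIdEncoding {

    // Assumed stand-in for how a file-based system ID is produced.
    static String toSystemId(File f) {
        return f.toURI().toASCIIString();
    }

    public static void main(String[] args) {
        // Letters, digits, '.' and '/' pass through unchanged.
        System.out.println(toSystemId(new File("/tmp/plain.xml")));

        // U+00E9 is encoded as its UTF-8 bytes: %C3%A9.
        System.out.println(toSystemId(new File("/tmp/caf\u00e9.xml")));

        // A space is ASCII but is still escaped, as %20.
        System.out.println(toSystemId(new File("/tmp/my file.xml")));
    }
}
```

Only the first path survives a plain prefix strip unchanged; the other two 
come back with percent-escapes in them.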

Because the write to the wrong file path may silently succeed, this may have 
security implications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
