Hi Folks,

A few more details about this vulnerability, plus a workaround:

SAXON has a configuration property, allowedProtocols, that can be set to "https,http" so that only HTTPS and HTTP URIs are resolved; any file URI access should then fail (i.e., access to the file should be blocked). However, even when allowedProtocols is set, SAXON fails to block file access when the string given to parse-xml() contains a user-defined entity--declared via an ENTITY declaration (in a DOCTYPE or in a DTD)--and the entity references a file.

Workaround: prevent parse-xml() from doing any DTD/DOCTYPE access; that is, disable DTD/DOCTYPE processing (https://saxonica.plan.io/issues/6711#note-12).

-----Original Message-----
From: Roger L Costello <coste...@mitre.org>
Sent: Saturday, March 8, 2025 8:07 AM
To: users@daffodil.apache.org
Subject: Alert: vulnerability in the SAXON XSLT processor

Below is an XSLT program that reads the Windows/win.ini file. A bad actor could use the program to read and display the contents of any file on your machine. This is a vulnerability.

The vulnerability affects all releases of SAXON.

The SAXON team is working to fix this vulnerability.

Explanation of how the vulnerability works

Sometimes you write an XSLT program that dynamically builds XML. The XML--which is a string--may then be dynamically processed using the XPath parse-xml(string) function.

Let's dig into dynamically generated XML that can read arbitrary files on your machine.

Recall that XML has five built-in entities: lt for the < symbol, gt for the > symbol, amp for the ampersand symbol, quot for the " symbol, and apos for the ' symbol.

You can create your own user-defined entities using <!ENTITY args>, where args is the name of the new entity--e.g., xxe (not a very readable entity name, that's okay)--followed by the value for the entity. The value may be given in-line as a string, or a file may be referenced to provide the value. Let's assign xxe the value of the Windows/win.ini file. To do so, follow xxe with the keyword SYSTEM and then the location of the file.

Here's how to create a user-defined xxe entity whose value is the content of the Windows/win.ini file:

    <!ENTITY xxe SYSTEM "file:///Windows/win.ini">

Place that entity declaration inside a DOCTYPE declaration:

    <!DOCTYPE root [
        <!ENTITY xxe SYSTEM "file:///Windows/win.ini">
    ]>

The DOCTYPE comes before the XML document's root element.

Here is XML which uses--displays--the value of the xxe entity:

    <root>&xxe;</root>

With that technical background, the following XSLT program should be understandable. Inside the select attribute, the markup characters of the embedded XML are escaped (&lt; &quot; &amp;) so that the stylesheet itself remains well-formed.

----------------------------------------------------------------------
XSLT program that could be exploited to read--and output--any file on your machine.
----------------------------------------------------------------------
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="#all"
                version="3.0">

    <xsl:template match="/">
        <Results>
            <xsl:sequence select="
                parse-xml(
                    '&lt;!DOCTYPE root [
                        &lt;!ENTITY xxe SYSTEM &quot;file:///Windows/win.ini&quot;>
                     ]>
                     &lt;root>&amp;xxe;&lt;/root>'
                )
            "/>
        </Results>
    </xsl:template>

</xsl:stylesheet>
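
The workaround noted at the top of this message (refuse any DTD/DOCTYPE before the string is parsed) can be illustrated outside SAXON. The following is only a minimal sketch of the idea, written in Python with the standard library's expat bindings; it is not SAXON's configuration mechanism. The helper name parse_rejecting_doctype and the exception DoctypeForbidden are hypothetical names introduced for this illustration. The SAXON-specific setting is the one described in the issue note linked above.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when the input carries a DOCTYPE declaration (hypothetical name)."""


def parse_rejecting_doctype(xml_text: str) -> None:
    """Parse xml_text, refusing any document that contains a DOCTYPE.

    Illustration only: a stand-in for "disable DTD/DOCTYPE", not SAXON code.
    """
    parser = expat.ParserCreate()

    def reject(name, system_id, public_id, has_internal_subset):
        # expat invokes this handler when it meets the DOCTYPE declaration,
        # before the ENTITY declarations of an internal subset are processed.
        raise DoctypeForbidden(f"DOCTYPE not allowed (declared root: {name})")

    parser.StartDoctypeDeclHandler = reject
    parser.Parse(xml_text, True)  # True marks this as the final chunk


if __name__ == "__main__":
    payload = (
        '<!DOCTYPE root [ <!ENTITY xxe SYSTEM "file:///Windows/win.ini"> ]>'
        "<root>&xxe;</root>"
    )
    try:
        parse_rejecting_doctype(payload)
    except DoctypeForbidden as err:
        print("rejected:", err)

    parse_rejecting_doctype("<root>plain content</root>")
    print("accepted: document without a DOCTYPE")
```

The design choice here is to reject the whole DOCTYPE rather than try to filter individual ENTITY declarations: once no DOCTYPE is accepted, no user-defined entity can be declared, so there is nothing left that could reference a file.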