These vulnerabilities are only an issue if you allow untrusted users to supply 
XML documents with DTDs.

If your system must allow users to submit XML documents with DTDs, then you 
probably want to pre-parse them before supplying them to BaseX, e.g., using a 
Java parser or Python with lxml or similar, where the entity-related 
vulnerabilities can be prevented or isolated. That is, your site can provide an 
upload target that preprocesses XML documents to sanitize them before 
submitting them to BaseX.
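
A minimal sketch of such a preprocessing step, in Python using only the 
standard library's expat bindings rather than lxml. It is stricter than 
sanitizing: it simply refuses any upload that contains a DOCTYPE declaration, 
so no entity declarations ever reach BaseX. The names `reject_doctype` and 
`DoctypeNotAllowed` are made up for illustration, and how the accepted bytes 
are then forwarded to BaseX is left out.

```python
import xml.parsers.expat


class DoctypeNotAllowed(ValueError):
    """Raised when an uploaded document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> bytes:
    """Return xml_bytes unchanged if it is well-formed and has no DOCTYPE.

    Raises DoctypeNotAllowed if a DOCTYPE declaration is present, and
    xml.parsers.expat.ExpatError if the document is not well-formed.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as the DOCTYPE starts, before any entity
        # declared in it can be referenced in the document content.
        raise DoctypeNotAllowed(f"DOCTYPE declaration not allowed: {name}")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return xml_bytes
```

An upload handler could call `reject_doctype(body)` and only pass the result 
on to BaseX when no exception is raised.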

One limitation I’ve run into with BaseX’s built-in parser is that it does not 
use Apache’s grammar cache feature, which makes it very inefficient for 
documents with large DTDs, like DITA documents.

My solution is simply not to use DTD-aware parsing. That works for DITA 
because we know the default attribute values for any given tag name and do 
not depend on any other DTD-specific feature (e.g., DITA doesn’t use external 
general entities for any defined purpose, such as references to images).
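
If downstream code does need those defaults materialized on the elements, one 
option is a lookup table keyed by tag name. The sketch below is only an 
illustration of that idea in Python, not what I actually run: the 
`DEFAULT_ATTRS` table and the `apply_defaults` function are hypothetical, and 
the table holds just three DITA `class` defaults where a real one would cover 
every element type in use.

```python
import xml.etree.ElementTree as ET

# Illustrative subset only: the defaults the DITA DTDs would otherwise supply.
DEFAULT_ATTRS = {
    "topic": {"class": "- topic/topic "},
    "title": {"class": "- topic/title "},
    "p": {"class": "- topic/p "},
}


def apply_defaults(root):
    """Add known default attributes to elements that do not already set them."""
    for elem in root.iter():
        for name, value in DEFAULT_ATTRS.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
    return root
```

Attributes that are already present in the document are left alone, which 
matches how a DTD default only applies when the attribute is absent.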

Cheers,

E.

_____________________________________________
Eliot Kimber
Sr. Staff Content Engineer

From: Nico Verwer (Rakensi) <nver...@rakensi.com>
Date: Thursday, March 13, 2025 at 5:26 PM
To: basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
Subject: [basex-talk] Protecting against XML vulnerabilities
I am trying to protect my BaseX application from XML vulnerabilities, like the 
ones described in 
<https://gist.github.com/mgeeky/4f726d3b374f0a34267d4f19c9004870>
and 
<https://learn.microsoft.com/en-us/archive/msdn-magazine/2009/november/xml-denial-of-service-attacks-and-defenses>.

My application runs as `basexhttp` inside a Docker container, and I set the 
following options in web.xml:

<context-param>
  <param-name>org.basex.dtd</param-name>
  <param-value>false</param-value>
</context-param>
<context-param>
  <param-name>org.basex.xinclude</param-name>
  <param-value>false</param-value>
</context-param>

I have not found other options, for example to let the parser limit expansion 
of internal entities.
Is there a way to set parser properties like `jdk.xml.entityExpansionLimit` in 
BaseX?
