[ 
https://issues.apache.org/jira/browse/SOLR-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18043923#comment-18043923
 ] 

Piotr Karwasz commented on SOLR-17888:
--------------------------------------

I think the simplest and most effective mitigation is to remove {{xercesImpl}} 
from the Solr runtime. Not only does it exhibit the least secure behavior, but 
it is also unnecessary: two other StAX implementations (Woodstox and the JDK’s 
Xerces fork) are already present on the classpath.

The default behavior of the available StAX parsers is as follows:

* *JDK fork of Xerces*:  
  This is the *only* one of the three that supports 
[JAXP 1.5 (JEP 185)|https://openjdk.org/jeps/185], so it is the one to keep if 
you are willing to drop Woodstox as well. JAXP 1.5 allows control over 
*external protocol access* via the three {{javax.xml.accessExternal*}} 
properties (see [Oracle's 
documentation|https://docs.oracle.com/en/java/javase/25/security/java-api-xml-processing-jaxp-security-guide.html#GUID-E345AA09-801E-4B95-B83D-7F0C452538AA]).
 Counterintuitively, for DOM and SAX these properties are empty by default 
(because {{FEATURE_SECURE_PROCESSING}} is enabled), but for StAX they default 
to “allow all”. To harden StAX, explicitly set the following properties to an 
empty value (meaning no protocol is allowed) in Solr’s startup script or in a 
{{jaxp.properties}} file referenced by it:

{noformat}
javax.xml.accessExternalDTD=
javax.xml.accessExternalSchema=
javax.xml.accessExternalStylesheet=
{noformat}

* *Woodstox*:  
  Woodstox does *not* resolve external DTDs, because Tika uses a custom 
resolver, but it *does* attempt to resolve external entities. It does so using 
only the *entity name* (e.g., {{ext}}) rather than the system ID, which 
effectively prevents access outside the current working directory. Woodstox 
does *not* [implement JAXP 1.5|https://github.com/FasterXML/woodstox/issues/51].

* *Xerces*:  
  This is the *least secure* of the three. When the custom resolver returns an 
empty result, Xerces falls back to its default resolver. Unlike the JDK fork, 
it does *not* implement JAXP 1.5 (see XERCESJ-1654), so it cannot be controlled 
via the {{javax.xml.accessExternal*}} properties.
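
For code that creates its own StAX factory, the same restriction can also be 
applied per factory instead of through system properties. The following is 
only a minimal sketch, assuming Java 9+ and the JDK's built-in StAX 
implementation; the class and method names ({{HardenedStaxSketch}}, 
{{newHardenedFactory}}) are hypothetical and are not taken from Solr or Tika. 
It sets only {{javax.xml.accessExternalDTD}}, since that is the property that 
governs DTD and external entity fetching in a StAX reader.

```java
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper, not part of Solr or Tika.
final class HardenedStaxSketch {

    /** Returns a JDK StAX factory that may not fetch external DTDs or entities. */
    static XMLInputFactory newHardenedFactory() {
        // newDefaultFactory() (Java 9+) returns the JDK's built-in implementation
        // regardless of other StAX providers on the classpath, such as Woodstox.
        XMLInputFactory factory = XMLInputFactory.newDefaultFactory();

        // An empty value means that no protocol (file, http, jar, ...) is allowed.
        // This is the per-factory equivalent of -Djavax.xml.accessExternalDTD=
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        return factory;
    }

    public static void main(String[] args) {
        XMLInputFactory factory = newHardenedFactory();
        System.out.println("accessExternalDTD = '"
                + factory.getProperty(XMLConstants.ACCESS_EXTERNAL_DTD) + "'");
    }
}
```

A factory obtained from Woodstox or from {{xercesImpl}} is not expected to 
honor this property, which is why the sketch pins the JDK implementation.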

*TL;DR*:  
* Remove {{xercesImpl}}.
* Consider also removing Woodstox.  
* Then set the three JEP 185 properties.  

After that, you can safely ignore most XXE / SSRF CVEs, because the relevant 
attack paths are closed.
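
To confirm that the removals had the intended effect, one option is to check 
which StAX factory the normal JAXP lookup actually resolves to. This is a 
rough sketch, assuming Java 9+; the class name {{StaxProviderCheck}} is 
hypothetical. It compares the factory found through the standard lookup with 
the JDK's built-in one.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical diagnostic, not part of Solr.
final class StaxProviderCheck {

    /** True when the standard JAXP lookup resolves to the JDK's built-in StAX factory. */
    static boolean usesJdkStax() {
        // newFactory() follows the usual lookup: system property, configuration
        // file, service loader, and finally the JDK default.
        Class<?> resolved = XMLInputFactory.newFactory().getClass();
        // newDefaultFactory() (Java 9+) always returns the JDK implementation.
        Class<?> builtIn = XMLInputFactory.newDefaultFactory().getClass();
        return resolved.equals(builtIn);
    }

    public static void main(String[] args) {
        System.out.println("StAX factory in use: "
                + XMLInputFactory.newFactory().getClass().getName());
        System.out.println("JDK built-in: " + usesJdkStax());
    }
}
```

If Woodstox is still on the classpath, the first line would be expected to 
name a Woodstox class and the check to report {{false}}.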


> Mitigate CVE-2025-54988, affecting Tika 1.x PDFParser
> -----------------------------------------------------
>
>                 Key: SOLR-17888
>                 URL: https://issues.apache.org/jira/browse/SOLR-17888
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 9.8, 9.9
>            Reporter: Dhoka Pramod
>            Assignee: Jan Høydahl
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




