ppkarwasz opened a new pull request, #5:
URL: https://github.com/apache/commons-xml/pull/5

   ## Summary
   
   Two related changes to DOM (`DocumentBuilderFactory`) hardening.
   
   ### 1. Capability-driven hardening
   
   Implements @elharo's suggestion in the [`xml-commons-dev@xerces` 
thread](https://lists.apache.org/thread/2yct4mkq9v0jf5xx3om1nbwbqqtgtbnm):
   
   > Why not set all relevant features and properties on each and check that 
you succeeded in configuring a minimal set to provide security?
   
   Replaces the per-implementation class-name dispatch for DOM with a single 
capability-driven recipe:
   
   - Secure processing (FSP) is required; the external-DTD subset is skipped 
where supported.
   - Whether the parser honours the JAXP 1.5 `accessExternal*` properties then 
decides whether the bare factory is already safe or needs a deny-all resolver 
wrapper. This is the only point where the stock JDK and the external Xerces 
distribution diverge.
   - Android stays untouched (its parser exposes no hardening surface).
   - An implementation is no longer rejected for being unrecognised: any parser 
that accepts secure processing and either the access properties or the resolver 
wrapper satisfies the contract. Only a parser that refuses secure processing 
now fails.
   - Adds a test that an `xsi:schemaLocation` hint is not fetched during 
DOM-side XSD validation (gated on parsers that honour `accessExternalSchema`).
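
The recipe above might be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name `CapabilityProbe`, its `Result` holder, and the choice of the Xerces `load-external-dtd` feature for skipping the external subset are assumptions made for the example.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

/** Hypothetical sketch of capability-driven hardening; not the PR's implementation. */
public final class CapabilityProbe {

    /** The configured factory, plus whether a deny-all resolver wrapper is still needed. */
    public static final class Result {
        public final DocumentBuilderFactory factory;
        public final boolean needsResolverWrapper;

        Result(DocumentBuilderFactory factory, boolean needsResolverWrapper) {
            this.factory = factory;
            this.needsResolverWrapper = needsResolverWrapper;
        }
    }

    public static Result harden() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);

        // Required capability: a parser that refuses secure processing is rejected.
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Parser refuses secure processing", e);
        }

        // Optional capability: skip the external DTD subset where the parser supports it.
        // (Assumption: the Xerces feature URI is used for this step.)
        try {
            factory.setFeature(
                    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        } catch (ParserConfigurationException e) {
            // Not supported by this parser; carry on.
        }

        // Deciding capability: are the JAXP 1.5 access properties recognised?
        boolean accessPropertiesHonoured;
        try {
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            accessPropertiesHonoured = true;
        } catch (IllegalArgumentException e) {
            // Unrecognised attribute: fall back to a deny-all resolver wrapper.
            accessPropertiesHonoured = false;
        }
        return new Result(factory, !accessPropertiesHonoured);
    }
}
```

A caller would inspect `needsResolverWrapper` and, when it is `true`, install a deny-all `EntityResolver` on every `DocumentBuilder` the factory produces.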
   
   ### 2. Minimal shading entry point
   
   Adds a public `DocumentBuilderHardener.newInstance()` so consumers that only 
need a hardened `DocumentBuilderFactory` can shade the library and copy a 
minimal set of classes.
   
   - `jdependency` (the engine `maven-shade-plugin`'s `minimizeJar` uses) works 
at class granularity, so the shaded set is the transitive closure of this one 
class.
   - A deny-all `EntityResolver` is installed as a local lambda instead of 
reusing `Resolvers`, which drops the whole `Resolvers` nested-class tree from 
the closure (12 -> 7 class files).
   - `XmlFactories.newDocumentBuilderFactory()` is routed through the new entry 
point.
   - `ShadingFootprintTest` uses `jdependency` to pin the reachable set to 
those 7 classes, so the footprint cannot silently grow back toward the full 
library.
   - Javadoc on both methods documents how to enable XInclude: it is blocked 
by the deny-all external-fetch behaviour, not by the XInclude-awareness flag, so 
enabling it also requires a custom `EntityResolver`.
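
For readers unfamiliar with the pattern, a deny-all `EntityResolver` written as a lambda could look like the sketch below. The class name `DenyAllResolverSketch`, the exception message, and the `example.invalid` URL are made up for the example; the actual code in this PR may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.EntityResolver;
import org.xml.sax.SAXException;

/** Hypothetical sketch of a deny-all resolver; not the PR's implementation. */
public final class DenyAllResolverSketch {

    // EntityResolver has a single abstract method, so a lambda is enough.
    // Every attempt to resolve an external entity is refused.
    static final EntityResolver DENY_ALL = (publicId, systemId) -> {
        throw new SAXException("External entity access denied: " + systemId);
    };

    /** Returns true if parsing a document with an external DTD is rejected. */
    public static boolean rejectsExternalDtd() {
        String xml = "<!DOCTYPE root SYSTEM \"http://example.invalid/root.dtd\"><root/>";
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(DENY_ALL);
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false; // parsed without complaint: nothing was denied
        } catch (SAXException e) {
            return true;  // the parse was rejected before any fetch
        } catch (IOException | ParserConfigurationException e) {
            return false; // unexpected failure mode for this sketch
        }
    }
}
```

Keeping the resolver as a lambda inside the entry-point class avoids a reference to a separate helper class, which is what keeps the shaded closure small.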
   
   ## Testing
   
   `mvn -o test` (all 7 JAXP-combination executions) passes with 0 failures and 
0 errors. The 2 skips are `SchemaLocationDomTest` under external Xerces, which 
does not honour `accessExternalSchema`.
   
   ## Follow-ups (separate PRs)
   
   - A deeply-nested-document (element-depth) test, to cover the limit FSP 
leaves unbounded on JDK 8-21.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)