Re: [PR] add OpenNLP CVE's Vex files [solr-site]

via GitHub Thu, 18 Jun 2026 05:48:28 -0700


Copilot commented on code in PR #192:
URL: https://github.com/apache/solr-site/pull/192#discussion_r3435864585



##########
content/solr/vex/2026-05-04-cve-2026-42440.md:
##########
@@ -0,0 +1,49 @@
+---
+cve: CVE-2026-42440
+category:
+  - solr/vex
+versions: "< 10.1.0"
+jars:
+  - opennlp-tools-1.9.4.jar
+analysis:
+  state: exploitable
+  response:
+    - workaround_available
+    - update
+title: "Apache OpenNLP: Out-of-memory denial of service via crafted model file"
+---
+
+CVE-2026-42440 (CVSS 7.5) is an out-of-memory denial-of-service issue in Apache OpenNLP's binary
+model reader. The `AbstractModelReader` methods `getOutcomes()`, `getOutcomePatterns()` and
+`getPredicates()` read a 32-bit signed integer count field from a binary model stream and pass it
+directly to an array allocation without validating it. An attacker who can supply a crafted `.bin`
+model file with a count set to `Integer.MAX_VALUE` triggers an immediate `OutOfMemoryError` during
+model deserialization, before any substantial data is consumed.
+
+The vulnerable code is present in the `opennlp-tools-1.9.4.jar` that Solr pulls in transitively via
+`lucene-analysis-opennlp` (Lucene 9.12.3). The 1.9.x line is end-of-life, so there is no patched 1.9
+release; the issue is fixed in OpenNLP 2.5.9 (and 3.0.0-M3), which validate the count against an
+upper bound (default 10,000,000, configurable via the `OPENNLP_MAX_ENTRIES` system property) before

Review Comment:
   The body states “The 1.9.x line is end-of-life, so there is no patched 1.9 
release”, but the PR description notes that the OpenNLP team is backporting 
fixes and plans to release 1.9.5. To avoid contradicting that, reword this to 
say there is not *yet* a patched 1.9.x release (or qualify it with timing).
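
For readers following the quoted hunk, the kind of bounded-count check it attributes to the fixed releases can be sketched roughly as below. This is an illustration only, not OpenNLP's actual code: the class `BoundedCountReader` and its methods are hypothetical, and the limit and property name are simply the values quoted in the hunk.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration; NOT OpenNLP's implementation.
class BoundedCountReader {
    // Default and property name are taken from the quoted text (assumed, not verified here).
    static final int MAX_ENTRIES = Integer.getInteger("OPENNLP_MAX_ENTRIES", 10_000_000);

    // Read the 32-bit count and reject it before any array is allocated.
    static int readCount(DataInputStream in) throws IOException {
        int count = in.readInt();
        if (count < 0 || count > MAX_ENTRIES) {
            throw new IOException("Entry count out of range: " + count);
        }
        return count;
    }

    // Allocation happens only after the count has been validated.
    static String[] readEntries(DataInputStream in) throws IOException {
        String[] entries = new String[readCount(in)];
        for (int i = 0; i < entries.length; i++) {
            entries[i] = in.readUTF();
        }
        return entries;
    }
}
```

With a check of this shape, a stream whose count field is `Integer.MAX_VALUE` fails with an `IOException` instead of reaching `new String[count]`.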



##########
content/solr/vex/2026-05-04-cve-2026-40682.md:
##########
@@ -0,0 +1,57 @@
+---
+cve: CVE-2026-40682
+category:
+  - solr/vex
+versions: "< 10.1.0"
+jars:
+  - opennlp-tools-1.9.4.jar
+analysis:
+  state: exploitable
+  response:
+    - workaround_available
+    - update
+title: "Apache OpenNLP: XXE in dictionary parsing"
+---
+
+CVE-2026-40682 (CVSS 9.1) is an XML External Entity (XXE) vulnerability in Apache OpenNLP's
+dictionary parsing. The `DictionaryEntryPersistor` class and the public `Dictionary(InputStream)`
+constructor create a SAX parser without enabling `FEATURE_SECURE_PROCESSING` or disabling DTD
+processing, so external entity resolution and DOCTYPE declarations remain fully enabled. An attacker
+who can supply a crafted dictionary file — either directly or embedded in a model archive that
+OpenNLP deserializes — can therefore read local files from the server or trigger outbound requests
+(server-side request forgery). Other OpenNLP XML parsing paths route through the hardened
+`XmlUtil.createSaxParser()` helper, but this code path does not.
+
+The vulnerable code is present in the `opennlp-tools-1.9.4.jar` that Solr pulls in transitively via
+`lucene-analysis-opennlp` (Lucene 9.12.3). The 1.9.x line is end-of-life, so there is no patched 1.9
+release; the issue is fixed in OpenNLP 2.5.9 (and 3.0.0-M3), which parse dictionaries with secure

Review Comment:
   The text asserts “The 1.9.x line is end-of-life, so there is no patched 1.9 
release”, but the PR description mentions an upcoming OpenNLP 1.9.5 backport. 
Reword this to reflect the current state (“not yet”) rather than implying there 
won’t be a patched 1.9.x release.
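
As background for the two settings the quoted hunk says are missing (secure processing and DTD handling), a minimal JAXP sketch of a hardened SAX parser might look like the following. The class `HardenedSax` is hypothetical and is not OpenNLP's `XmlUtil`; the `disallow-doctype-decl` feature is the Xerces feature supported by the JDK's built-in parser, and other parser implementations may not recognize it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; NOT OpenNLP's XmlUtil helper.
class HardenedSax {
    static SAXParser newParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Apply the JAXP secure-processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any DOCTYPE declaration, which rules out external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setNamespaceAware(true);
        return factory.newSAXParser();
    }

    // Returns true if the document parses, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            newParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

A plain document such as `<dictionary/>` still parses, while any input that starts with a `<!DOCTYPE ...>` declaration is rejected before entity resolution can take place.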



##########
content/solr/vex/2026-05-04-cve-2026-42027.md:
##########
@@ -0,0 +1,66 @@
+---
+cve: CVE-2026-42027
+category:
+  - solr/vex
+versions: "< 10.1.0"
+jars:
+  - opennlp-tools-1.9.4.jar
+analysis:
+  state: exploitable
+  response:
+    - workaround_available
+    - update
+title: "Apache OpenNLP: Arbitrary class instantiation via model manifest"
+---
+
+CVE-2026-42027 (CVSS 9.8) is an arbitrary class instantiation issue in Apache OpenNLP's
+`ExtensionLoader`. The `instantiateExtension(Class, String)` method loads a class named in a
+model archive's `manifest.properties` via `Class.forName()` and only performs its
+`isAssignableFrom` type check *after* the class has been loaded. Because `Class.forName()`
+runs the target class's static initializer at load time, an attacker who can supply a crafted
+model archive can trigger the static initializer of any class on the classpath (e.g. one that
+performs a JNDI lookup, outbound network I/O, or filesystem access), regardless of the
+type check that follows.
+
+The vulnerable code is present in the `opennlp-tools-1.9.4.jar` that Solr pulls in transitively
+via `lucene-analysis-opennlp` (Lucene 9.12.3). The 1.9.x line is end-of-life, so there is no
+patched 1.9 release; the issue is fixed in OpenNLP 2.5.9 (and 3.0.0-M3), which consults a

Review Comment:
   This file says “The 1.9.x line is end-of-life, so there is no patched 1.9 
release”, but the PR description indicates an OpenNLP 1.9.5 backport is in 
progress. Please soften this wording to “not yet” (or otherwise qualify) so it 
doesn’t read as a definitive ‘will never be patched’.
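
To illustrate the ordering problem the quoted hunk describes (type check after class initialization), here is a small sketch that loads a class without initializing it and checks the type first. This is one possible mitigation shown for illustration, not necessarily what the OpenNLP fix does (the quoted description is cut off); `SafeExtensionLoader`, `Flag`, and `Trap` are hypothetical names.

```java
// Hypothetical illustration; NOT OpenNLP's ExtensionLoader.
class Flag {
    static boolean tripped = false;
}

// Stand-in for an attacker-chosen class whose static initializer has a side effect.
class Trap {
    static {
        Flag.tripped = true;
    }
}

class SafeExtensionLoader {
    static <T> T instantiate(Class<T> expected, String className)
            throws ReflectiveOperationException {
        // initialize=false: the class is loaded but its static initializer does not run yet.
        Class<?> candidate =
            Class.forName(className, false, SafeExtensionLoader.class.getClassLoader());
        // Type check happens before anything in the candidate class executes.
        if (!expected.isAssignableFrom(candidate)) {
            throw new ClassCastException(
                className + " is not assignable to " + expected.getName());
        }
        // Only a class of the expected type is initialized and instantiated.
        return expected.cast(candidate.getDeclaredConstructor().newInstance());
    }
}
```

Asking this loader for a `Runnable` while naming `Trap` fails the type check and leaves `Flag.tripped` unset, whereas the one-argument `Class.forName(String)` would have run `Trap`'s static initializer first.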



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


