J-HowHuang opened a new pull request, #18659:
URL: https://github.com/apache/pinot/pull/18659

   ## Problem
   
   `hadoop-common` (and `hadoop-hdfs`) pull in `org.eclipse.jetty` transitive 
dependencies (`jetty-server` -> `jetty-http`, plus `jetty-util`, 
`jetty-servlet`, and `jetty-webapp`) for Hadoop's embedded `HttpServer2` web 
UI. `pinot-orc` and `pinot-parquet` use only `org.apache.hadoop.fs.Path` and 
`org.apache.hadoop.conf.Configuration` to read data and never start an 
`HttpServer2`, so the Jetty jars serve no purpose there: they leak into 
runtime distributions and surface **CVE-2026-2332** (jetty-http request 
smuggling, `jetty-http <= 9.4.59`) on any image that bundles these plugins.
   
   Additionally, `hadoop-client-runtime` is a Hadoop uber-jar that bundles its 
own copy of Jetty (relocated to `org.apache.hadoop.shaded.org.eclipse.jetty`). 
A Maven `<exclusion>` cannot remove those classes since they are baked into the 
uber-jar rather than resolved as a dependency node, and the leftover `jetty-*` 
Maven metadata inside it still makes scanners flag CVE-2026-2332 in the 
`pinot-parquet` shaded jar.
   
   ## Changes
   
   1. **Exclude `org.eclipse.jetty` from `hadoop-common` and `hadoop-hdfs`** in 
both `pinot-orc` and `pinot-parquet`.
   2. **Scope `hadoop-mapreduce-client-core` to `test` in `pinot-parquet`** 
(documented "Used for Parquet Writer in tests"), removing its compile-scope 
`hadoop-yarn-client -> websocket-client -> jetty-client -> jetty-http` chain.
   3. **Add a shade-plugin filter in `pinot-parquet`** 
(`combine.children="append"`, preserving the inherited root-pom filters) that 
strips the relocated Jetty classes, their Maven metadata, and the orphaned 
`ServiceLoader` entries from the shaded jar.
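   
   For readers unfamiliar with the mechanisms involved, the three changes can be sketched as a `pinot-parquet` `pom.xml` fragment. This is an illustration under stated assumptions, not the actual diff: it assumes dependency and plugin versions are managed by a parent pom, that a wildcard `artifactId` exclusion (supported since Maven 3.2.1) is used, and that the relocated classes, Maven metadata, and service files sit at the paths shown. The exact exclude patterns in the PR may differ.

   ```xml
   <!-- Hypothetical pinot-parquet pom.xml fragment; versions assumed to come from the parent pom. -->
   <dependencies>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
       <exclusions>
         <!-- Change 1: drop every org.eclipse.jetty transitive dependency. -->
         <exclusion>
           <groupId>org.eclipse.jetty</groupId>
           <artifactId>*</artifactId>
         </exclusion>
       </exclusions>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
       <exclusions>
         <exclusion>
           <groupId>org.eclipse.jetty</groupId>
           <artifactId>*</artifactId>
         </exclusion>
       </exclusions>
     </dependency>
     <dependency>
       <!-- Change 2: only needed by tests, so keep it off the compile/runtime classpath. -->
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-mapreduce-client-core</artifactId>
       <scope>test</scope>
     </dependency>
   </dependencies>

   <build>
     <plugins>
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-shade-plugin</artifactId>
         <configuration>
           <!-- Change 3: append to the filters inherited from the root pom instead of replacing them. -->
           <filters combine.children="append">
             <filter>
               <artifact>org.apache.hadoop:hadoop-client-runtime</artifact>
               <excludes>
                 <!-- Assumed paths: relocated Jetty classes, bundled Maven metadata, service files. -->
                 <exclude>org/apache/hadoop/shaded/org/eclipse/jetty/**</exclude>
                 <exclude>META-INF/maven/org.eclipse.jetty/**</exclude>
                 <exclude>META-INF/services/org.apache.hadoop.shaded.org.eclipse.jetty.*</exclude>
               </excludes>
             </filter>
           </filters>
         </configuration>
       </plugin>
     </plugins>
   </build>
   ```

   The exclusions act on the dependency graph, which is why they are enough for `hadoop-common` and `hadoop-hdfs`, while the shade filter acts on archive entries, which is the only way to reach classes already bundled inside `hadoop-client-runtime`.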
   
   ## Verification
   
   - `dependency:tree` shows that no compile- or runtime-scope Jetty artifact 
remains in either module, while `hadoop-common` and `hadoop-hdfs` themselves 
still resolve.
   - The rebuilt `pinot-parquet` shaded jar contains zero `org.eclipse.jetty` 
entries while the shaded Woodstox classes (needed by Hadoop `Configuration`) 
remain.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

