J-HowHuang opened a new pull request, #18659: URL: https://github.com/apache/pinot/pull/18659
## Problem

`hadoop-common` (and `hadoop-hdfs`) pull `org.eclipse.jetty` transitives (`jetty-server` -> `jetty-http`, plus `jetty-util`/`servlet`/`webapp`) for Hadoop's embedded `HttpServer2` web UI. `pinot-orc` and `pinot-parquet` only use `org.apache.hadoop.fs.Path` and `org.apache.hadoop.conf.Configuration` to read data; they never start an `HttpServer2`. The Jetty jars therefore only leak into runtime distributions and surface **CVE-2026-2332** (jetty-http request smuggling, `jetty-http <= 9.4.59`) on any image that bundles these plugins.

Additionally, `hadoop-client-runtime` is a Hadoop uber-jar that bundles its own copy of Jetty (relocated to `org.apache.hadoop.shaded.org.eclipse.jetty`). A Maven `<exclusion>` cannot remove those classes, since they are baked into the uber-jar rather than resolved as a dependency node, and the leftover `jetty-*` Maven metadata inside it still makes scanners flag CVE-2026-2332 in the `pinot-parquet` shaded jar.

## Changes

1. **Exclude `org.eclipse.jetty` from `hadoop-common` and `hadoop-hdfs`** in both `pinot-orc` and `pinot-parquet`.
2. **Scope `hadoop-mapreduce-client-core` to `test` in `pinot-parquet`** (documented "Used for Parquet Writer in tests"), removing its compile-scope `hadoop-yarn-client -> websocket-client -> jetty-client -> jetty-http` chain.
3. **Add a shade-plugin filter in `pinot-parquet`** (`combine.children="append"`, preserving the inherited root-pom filters) that strips the relocated Jetty classes, their Maven metadata, and the orphaned `ServiceLoader` entries from the shaded jar.

## Verification

- `dependency:tree` shows that no compile/runtime-scope Jetty remains in either module, while `hadoop-common`/`hadoop-hdfs` themselves still resolve.
- The rebuilt `pinot-parquet` shaded jar contains zero `org.eclipse.jetty` entries, while the shaded Woodstox classes (needed by Hadoop `Configuration`) remain.
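For readers who want to picture the three changes described above, the following POM fragment is a minimal sketch and not the actual diff. It assumes that dependency and plugin versions are managed by a parent POM, that the wildcard `<artifactId>*</artifactId>` exclusion is acceptable (Maven 3.2.1 or later), and that the exclude patterns in the shade filter match the real paths inside `hadoop-client-runtime`. The `META-INF/maven/...` and `META-INF/services/...` patterns in particular are illustrative guesses, not paths confirmed by this PR.

```xml
<project>
  <dependencies>
    <!-- Change 1: keep hadoop-common, drop its Jetty transitives.
         The same exclusion block would be repeated for hadoop-hdfs. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <exclusions>
        <exclusion>
          <groupId>org.eclipse.jetty</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <!-- Change 2: only needed by tests, so keep it off the runtime classpath. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <configuration>
          <!-- Change 3: append to, rather than replace, filters inherited from the parent. -->
          <filters combine.children="append">
            <filter>
              <artifact>org.apache.hadoop:hadoop-client-runtime</artifact>
              <excludes>
                <!-- Relocated Jetty classes baked into the uber-jar. -->
                <exclude>org/apache/hadoop/shaded/org/eclipse/jetty/**</exclude>
                <!-- Assumed location of the leftover jetty-* Maven metadata. -->
                <exclude>META-INF/maven/org.eclipse.jetty/**</exclude>
                <!-- Assumed pattern for orphaned ServiceLoader entries. -->
                <exclude>META-INF/services/org.apache.hadoop.shaded.org.eclipse.jetty.*</exclude>
              </excludes>
            </filter>
          </filters>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
```

One way to check the first two changes locally is `mvn dependency:tree -Dincludes=org.eclipse.jetty` in each module, which should report no compile- or runtime-scope matches once the exclusions are in place.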
🤖 Generated with [Claude Code](https://claude.com/claude-code)
