ramackri opened a new pull request, #1019:
URL: https://github.com/apache/ranger/pull/1019

   Fixes [RANGER-5646](https://issues.apache.org/jira/browse/RANGER-5646): Hive 
plugin audit delivery to the central audit ingestor fails with **HTTP 401** 
when duplicate libraries in `lib/ranger-hive-plugin-impl/` conflict with 
HiveServer2’s classpath.
   
   ### Problem
   
   When `xasecure.audit.destination.auditserver=true`, the Hive plugin uses 
`RangerAuditServerDestination` (Jersey 2 REST client) to POST audits to the 
audit ingestor. Ranger plugin assemblies **whitelist** dependencies into 
`lib/ranger-hive-plugin-impl/`; anything listed is loaded on the isolated 
plugin classloader.
   
   The Hive assembly currently whitelists JARs that HiveServer2 **already 
provides at different versions** on its application classpath (Jackson 2.17 
from the Ranger build vs. 2.16 on Hive 4.x, plus httpclient/httpcore, hppc, 
commons-collections, etc.). This version skew across classloaders breaks the 
Jersey audit client’s auth/serialization path:
   
   ```
   Failed to send audit batch … HTTP 401
   Authentication failure
   ```
   
   This is **not** an ingestor allowlist issue (that would be HTTP **403**). 
Manually removing the duplicate JARs from `lib/ranger-hive-plugin-impl/` after 
install clears the 401 locally, which is the same outcome this PR produces at 
packaging time.
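
One way to spot this kind of overlap on an installed node is to compare artifact versions between the plugin directory and HiveServer2's own library directory. The sketch below is illustrative only and not part of this PR: the function names `jar_versions` and `version_skew` are hypothetical, the paths in the usage line are assumptions about a local install, and it assumes JARs are named `<artifact>-<version>.jar` (artifacts whose own name contains `-<digit>` will be split incorrectly).

```bash
# Print "artifact version" for each JAR in directory $1.
# Assumes file names of the form <artifact>-<version>.jar.
jar_versions() {
  local f base name
  for f in "$1"/*.jar; do
    [ -e "$f" ] || continue              # directory contains no JARs
    base=${f##*/}
    base=${base%.jar}
    # Artifact name = everything before the first "-<digit>".
    name=$(printf '%s\n' "$base" | sed 's/-[0-9].*$//')
    [ "$name" = "$base" ] && continue    # no version suffix; skip
    printf '%s %s\n' "$name" "${base#"$name"-}"
  done
}

# Print artifacts present in both directories at different versions.
# $1 = plugin-impl directory, $2 = host application lib directory.
version_skew() {
  {
    jar_versions "$1" | sed 's/^/A /'
    jar_versions "$2" | sed 's/^/B /'
  } | awk '
    $1 == "A" { a[$2] = $3; next }
    # Append "" to force string (not numeric) comparison of versions.
    ($2 in a) && (a[$2] "") != ($3 "") { print $2 ": " a[$2] " vs " $3 }
  '
}
```

Assuming `HIVE_HOME` points at the Hive install, a call such as `version_skew "$HIVE_HOME/lib/ranger-hive-plugin-impl" "$HIVE_HOME/lib"` would list each shared artifact together with the two versions found.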
   
   **Related but separate:** 
[#1015](https://github.com/apache/ranger/pull/1015) (RANGER-5642/5644) **adds** 
missing Jersey JARs for Kafka/HBase (`MessageBodyWriter not found`). This PR 
**removes** duplicate/wrong-version JARs for Hive — opposite direction, same 
auditserver destination.
   
   ### Solution
   
   Tighten the `lib/ranger-hive-plugin-impl` whitelist in 
`distro/src/main/assembly/hive-agent.xml`:
   
   - **Remove** libraries Hive/Hadoop already ship (or that must not be 
duplicated on the plugin classloader): `hppc`, Ranger-pinned Jackson 2.17 
(`jackson-core`, `jackson-databind`, `jackson-annotations`, `jackson-jaxrs-*`), 
`httpclient`, `httpcore`, `httpcore-nio`, `commons-collections`, 
`javax.annotation-api`, `joda-time`, duplicate 
`jackson-module-jaxb-annotations` entries.
   - **Keep** Jersey audit-server client stack, audit-core/dest-auditserver 
module JARs, Graal/ICU (policy engine), Solr/Jetty/httpasyncclient/httpmime 
where not duplicated by Hive lib.
   - **Pin** `jackson-module-jaxb-annotations:2.16.1` (Hive 4.x–aligned JAXB 
support for Jersey JSON) instead of `${fasterxml.jackson.version}` (2.17).
   
   No Java source, POM, or Docker changes — **packaging only**.
   
   ### Changes
   
   | Area | File | Change |
   |------|------|--------|
   | Hive packaging | `distro/src/main/assembly/hive-agent.xml` | Filter 
`lib/ranger-hive-plugin-impl` whitelist: drop Hive/Hadoop duplicate deps; pin 
`jackson-module-jaxb-annotations:2.16.1` |
   
   #### Removed from plugin-impl whitelist
   
   | Maven coordinate | Why removed |
   |------------------|-------------|
   | `com.carrotsearch:hppc` | Hive lib provides; version skew → 401 |
   | `com.fasterxml.jackson.core:jackson-*` (2.17) | HS2 uses 2.16.x; plugin 
must not ship 2.17 copies |
   | `com.fasterxml.jackson.jaxrs:jackson-jaxrs-*` (2.17) | Same |
   | `com.fasterxml.jackson.module:jackson-module-jaxb-annotations` (2.17) | 
Replaced by pinned 2.16.1 |
   | `commons-collections:commons-collections` | Duplicate of Hive classpath |
   | `javax.annotation:javax.annotation-api` | Duplicate |
   | `joda-time:joda-time` | Duplicate |
   | `org.apache.httpcomponents:httpclient` | Hive/Hadoop lib provides aligned 
version |
   | `org.apache.httpcomponents:httpcore` / `httpcore-nio` | Same |
   
   #### Retained (audit + plugin runtime)
   
   | Category | Examples |
   |----------|----------|
   | Audit REST client | `jersey-client`, `jersey-common`, 
`jersey-media-json-jackson`, `jakarta.ws.rs-api` |
   | JAXB (Hive-aligned) | `jackson-module-jaxb-annotations:2.16.1` |
   | Other whitelisted | `httpasyncclient`, `httpmime`, `solr-solrj`, 
`jetty-client`, Graal/ICU, `hadoop-shaded-guava` |
   
   ### Why HDFS assembly is not changed
   
   `hdfs-agent.xml` still whitelists `hppc`, `httpclient`, and the 
Ranger-pinned Jackson JARs, and the NameNode tolerates that layout. HiveServer2 
does not: duplicate `hive-*`, Jackson, and HTTP client JARs on the plugin 
classloader cause the 401 (and duplicated `hive-*` JARs can also cause 
`ClassCastException` for `hive-*` APIs). The trimming is intentionally 
Hive-specific.
   
   ## Test plan
   
   - [ ] Rebuild hive plugin tarball from fixed assembly (full reactor or 
`distro` with `-P-all,ranger-hive-plugin`).
   - [ ] Verify `lib/ranger-hive-plugin-impl/` in tarball **does not** contain: 
`hppc`, `httpclient`, `httpcore`, `jackson-core-2.17*`, 
`commons-collections-3.2.2`, `hive-*`.
   - [ ] Verify tarball **does** contain: `jersey-client`, 
`ranger-audit-dest-auditserver`, `jackson-module-jaxb-annotations-2.16.1`.
   - [ ] Install/enable plugin on HiveServer2 (Kerberos + auditserver 
destination enabled).
   - [ ] Run a Hive query; confirm HS2 logs show no HTTP **401** on audit batch 
send to ingestor.
   - [ ] Confirm audits appear for the Hive service repository in the audit 
pipeline (ingestor / Solr / Admin Access as applicable).
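
The two tarball-content checks in the list above could be scripted roughly as follows. This is a minimal sketch, not something shipped with the PR: the function name `check_plugin_impl` is hypothetical, and the file-name patterns are taken directly from the test plan. It reads a file listing on stdin, so it makes no assumption about the top-level directory name inside the tarball.

```bash
# Reads a file listing (one path per line, e.g. from `tar -tzf`) on stdin and
# checks the JARs under lib/ranger-hive-plugin-impl/ against the test plan.
# Returns 0 if all checks pass, 1 otherwise.
check_plugin_impl() {
  local rc=0 jar req found
  local jars
  # Keep only JAR basenames under lib/ranger-hive-plugin-impl/.
  jars=$(grep 'lib/ranger-hive-plugin-impl/.*\.jar$' | sed 's|.*/||')

  # JARs that must NOT be present.
  for jar in $jars; do
    case "$jar" in
      hppc-*|httpclient-*|httpcore-*|jackson-core-2.17*|commons-collections-3.2.2*|hive-*)
        echo "UNEXPECTED: $jar"
        rc=1
        ;;
    esac
  done

  # JARs that must be present (glob patterns matched against basenames).
  for req in 'jersey-client-*' 'ranger-audit-dest-auditserver-*' \
             'jackson-module-jaxb-annotations-2.16.1*'; do
    found=0
    for jar in $jars; do
      case "$jar" in
        $req) found=1 ;;
      esac
    done
    if [ "$found" -eq 0 ]; then
      echo "MISSING: $req"
      rc=1
    fi
  done
  return "$rc"
}
```

With the tarball path from the build example, usage would look like `tar -tzf ../target/ranger-*-hive-plugin.tar.gz | check_plugin_impl`; no output and a zero exit status mean both checks passed.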
   
   **Build example:**
   
   ```bash
   cd distro
   mvn install -P-all,ranger-hive-plugin -DskipTests -Drat.skip=true
   # Tarball: ../target/ranger-*-hive-plugin.tar.gz
   ```
   
   ### Scope note
   
   This packaging fix is **sufficient for RANGER-5646** in production: rebuild 
the tarball, run `enable-hive-plugin.sh`, and restart HS2. Dev Docker harness 
scripts (runtime JAR alignment, audit URL wiring) are out of scope for this PR.
   

