ramackri opened a new pull request, #690:
URL: https://github.com/apache/atlas/pull/690

   ## Summary
   
   Fix Trino Extractor standalone tarball failures when importing `trino_*` 
metadata into Atlas via `AtlasClientV2`. The bug has existed since ATLAS-5021 
(PR #428, Sep 2025) and is unrelated to the Kafka 3.9.1 upgrade.
   
   ## Root cause
   
   1. **`jersey-client` 1.9 pin** in `addons/trino-extractor/pom.xml` 
conflicted with `atlas-client-v2` (1.19) → duplicate Jersey jars in distro 
`lib/`.
   2. **Entity POST APIs** passed Java objects to Jersey. This works on full 
server / curated bridge classpaths, where Jersey POJO mapping is available, but 
fails in the minimal extractor tarball (`MessageBodyWriter` not found for 
`AtlasEntity$AtlasEntityWithExtInfo`).
   3. **Entity GET APIs** could not deserialize model responses via Jersey POJO 
mapping for inner classes in standalone `lib/` layout.
   4. **`AuthenticationUtil`** only read credentials from `System.console()` → 
`401 Unauthorized` in non-interactive runs.
   
   Type-def APIs already used `AtlasType.toJson()`; entity APIs did not — that 
asymmetry is why type defs could work while entity POSTs failed.
   
   ## Changes
   
   | File | Change |
   |------|--------|
   | `addons/trino-extractor/pom.xml` | Remove explicit `jersey-client` 1.9 pin; inherit `jersey.version` 1.19 from parent |
   | `client/client-v2/.../AtlasClientV2.java` | Entity mutation APIs send JSON via `AtlasType.toJson()`: `createEntity`, `createEntities`, `updateEntity`, `updateEntities`, `updateEntityByAttribute` |
   | `client/common/.../AtlasBaseClient.java` | For `org.apache.atlas.model.*` response types, read body as String and parse with `AtlasJson.fromJson()` |
   | `intg/.../AuthenticationUtil.java` | Support `ATLAS_USERNAME` / `ATLAS_PASSWORD` env vars before console prompt |
   
   **Note on `AtlasClientV2`:** Existing bridges and webapp ITs used 
object-passing successfully where Jersey POJO mapping had a clean classpath. 
Pre-serializing with `AtlasType.toJson()` produces the same wire JSON and 
aligns entity APIs with type-def APIs — no behaviour change for working callers.
   
   ## Testing
   
   ### Build
   
   ```bash
   mvn -pl addons/trino-extractor,client/client-v2,client/common,intg -am package -DskipTests
   mvn -pl addons/trino-extractor,distro -am package -DskipTests -Pdist
   ```
   
   Confirm tarball ships only `jersey-client-1.19.jar` (not 1.9).
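
One way to run this check is sketched below. The helper name `jersey_client_jars` and the tarball path in the usage comment are illustrative assumptions, not part of the PR; the sketch only assumes the tarball is gzip-compressed and that `tar`, `grep`, `sed`, and `sort` are available.

```bash
#!/usr/bin/env bash
# List the distinct jersey-client jar file names packaged in a gzip tarball.
# Hypothetical helper: the name and argument are illustrative, not part of Atlas.
jersey_client_jars() {
  local tarball="$1"
  # tar -tzf lists archive members; keep only jersey-client jars,
  # strip the directory part, and de-duplicate.
  tar -tzf "$tarball" \
    | grep -E '(^|/)jersey-client-[^/]*\.jar$' \
    | sed 's|.*/||' \
    | sort -u
}

# Usage (the path is an assumption; adjust to the actual build output):
#   jersey_client_jars distro/target/apache-atlas-<version>-trino-extractor.tar.gz
# For this fix the output should be the single line jersey-client-1.19.jar.
```

More than one line of output would indicate that duplicate Jersey client jars are still being packaged.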
   
   ### Manual — Trino extractor smoke test
   
   1. Extract `apache-atlas-*-trino-extractor.tar.gz`; configure 
`atlas.rest.address`, Trino JDBC URL, namespace, and catalog in 
`atlas-trino-extractor.properties`.
   2. Run extractor for a single Hive-backed table with `ATLAS_USERNAME` / 
`ATLAS_PASSWORD` set (no TTY required).
   3. Confirm `trino_column` entity present via Atlas REST unique-attribute 
lookup (e.g. `qualifiedName=hive.hr.trino_pii_hive_v2.ssn@dev`).
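
The lookup in step 3 could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the base address `http://localhost:21000` in the usage comment is assumed, and passing `ATLAS_USERNAME` / `ATLAS_PASSWORD` to `curl` as basic-auth credentials is a convenience of this sketch rather than something the PR prescribes. The path is the Atlas v2 unique-attribute entity endpoint.

```bash
#!/usr/bin/env bash
# Build the Atlas v2 unique-attribute lookup URL for an entity type.
# Hypothetical helper; a trailing slash on the base address is tolerated.
atlas_unique_attr_url() {
  local base="${1%/}" type_name="$2"
  printf '%s/api/atlas/v2/entity/uniqueAttribute/type/%s\n' "$base" "$type_name"
}

# Fetch an entity by qualifiedName using basic auth from the environment.
# curl -G moves the url-encoded data into the query string, so characters
# such as '@' in the qualified name are encoded for us.
atlas_lookup() {
  local base="$1" type_name="$2" qualified_name="$3"
  curl -sS -f -G \
    -u "${ATLAS_USERNAME}:${ATLAS_PASSWORD}" \
    --data-urlencode "attr:qualifiedName=${qualified_name}" \
    "$(atlas_unique_attr_url "$base" "$type_name")"
}

# Usage (the address is an assumption):
#   atlas_lookup http://localhost:21000 trino_column 'hive.hr.trino_pii_hive_v2.ssn@dev'
```

With `-f`, `curl` exits non-zero on an HTTP error such as 404, which makes the check usable in a script.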
   
   **Result:** **PASS** — no `MessageBodyWriter` / `MessageBodyReader` errors; 
`trino_*` entities imported successfully.
   
   ### Manual — Trino tag-auth E2E
   
   Full path verified end-to-end:
   
   | Step | What was verified | Result |
   |------|-------------------|--------|
   | Metadata import | Extractor imports `trino_*` entities (not manual REST fallback) | **PASS** |
   | Classification | PII tag applied on `trino_column` via Atlas REST | **PASS** |
   | TagSync → Ranger | Tag mapping propagated to Trino service in Ranger Admin | **PASS** |
   | Tag enforcement | Admin sees raw value; denied user blocked; masked user sees masked SSN | **PASS** |
   | Audit | Trino access audit entries recorded in Ranger | **PASS** |
   
   **Overall:** **PASS** — extractor-first metadata path; tag-based deny/mask 
enforced on Trino queries.
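
For reference, the classification step in the table above could be reproduced with a call like the one sketched here. The helper names are hypothetical, the classification type name `PII` and the base address in the usage comment are assumptions, and the entity GUID has to come from a prior lookup. The endpoint is the Atlas v2 add-classifications API for an entity GUID.

```bash
#!/usr/bin/env bash
# Print the JSON body for attaching one classification to an entity.
# Hypothetical helper; assumes the type name needs no JSON escaping.
classification_payload() {
  printf '[{"typeName":"%s"}]\n' "$1"
}

# POST the classification to an entity identified by GUID.
# Supplying -d makes curl issue a POST request.
atlas_add_classification() {
  local base="${1%/}" guid="$2" tag="$3"
  curl -sS -f \
    -u "${ATLAS_USERNAME}:${ATLAS_PASSWORD}" \
    -H 'Content-Type: application/json' \
    -d "$(classification_payload "$tag")" \
    "${base}/api/atlas/v2/entity/guid/${guid}/classifications"
}

# Usage (address, GUID placeholder, and tag name are assumptions):
#   atlas_add_classification http://localhost:21000 "$COLUMN_GUID" PII
```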
   
   ## Related
   
   - ATLAS-5021 — original Trino Extractor feature (PR #428)
   - Not caused by Kafka 3.9.1 (Dependabot commit `6709f6459` only changed 
`<kafka.version>`)
   
   https://issues.apache.org/jira/browse/ATLAS-5337
   

