adnanhemani opened a new pull request, #4667:
URL: https://github.com/apache/polaris/pull/4667

   ## Summary
   
   Adds the OpenLineage-compatible ingest endpoint defined in the Polaris 
OpenLineage proposal. This first PR mounts the route and accepts events; 
persistence, dataset resolution, and downstream forwarding are follow-up PRs.
   
   The endpoint is mounted at the **standard OpenLineage path** (`POST 
/api/v1/lineage`) so any engine using the OpenLineage HTTP transport (Spark, 
Flink, Airflow, Trino, dbt) can target Polaris by URL change alone, with no 
client-side rewriting required.
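
A minimal sketch of what a client-side call to this route could look like, using only the JDK's `java.net.http` API. The class name `LineageRequestSketch`, the base URL, and the `Bearer` token scheme are assumptions for illustration; the PR only states that requests without an `Authorization` header are rejected with `401`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the PR: builds the POST an OpenLineage
// HTTP client would send to the standard path.
final class LineageRequestSketch {

    static HttpRequest build(String baseUrl, String bearerToken, String eventJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/lineage"))
                .header("Content-Type", "application/json")
                // Assumption: bearer-token auth; substitute whatever scheme the server expects.
                .header("Authorization", "Bearer " + bearerToken)
                .POST(HttpRequest.BodyPublishers.ofString(eventJson))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())`; in practice an engine's OpenLineage HTTP transport builds the equivalent request once it is pointed at the Polaris URL.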
   
   ### Why hand-written instead of OpenAPI codegen
   
   Body parsing follows the 
[Marquez](https://github.com/MarquezProject/marquez) (OpenLineage reference 
server) pattern: a hand-written JAX-RS resource on top of 
`io.openlineage:openlineage-java`, with Jackson polymorphism keyed on the 
`schemaURL` field to dispatch between `RunEvent` / `JobEvent` / `DatasetEvent`.
   
   The OpenAPI Generator's Java template cannot translate the OpenLineage 
spec's `oneOf` faithfully: it collapses the variants into a single class with 
every variant's required fields marked `@NotNull`, rejecting every valid event 
with a 400. Codegen is therefore intentionally skipped for this module; 
`spec/openlineage-service.yaml` is kept as documentation only.
   
   ### Wrapper hierarchy
   
   `PolarisLineageEvent` is a sealed base with three permitted subclasses 
(`OfRunEvent`, `OfJobEvent`, `OfDatasetEvent`). Each wrapper holds the official 
`io.openlineage.server.OpenLineage.{Run,Job,Dataset}Event` by composition 
(those classes are `final`). A custom `JsonTypeIdResolver` reads the trailing 
path segment of `schemaURL` (e.g. `…/RunEvent`) to dispatch. Unknown or missing 
`schemaURL` falls back to `RunEvent`, matching Marquez behavior.
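
The dispatch rule described above can be sketched in plain Java, without Jackson. This is an illustration of the trailing-segment lookup and the `RunEvent` fallback only; `SchemaUrlDispatch` and `EventKind` are hypothetical names, and the real `LineageEventTypeResolver` in the PR may be structured differently.

```java
// Hypothetical sketch of the schemaURL dispatch rule; not the PR's resolver.
final class SchemaUrlDispatch {

    enum EventKind { RUN_EVENT, JOB_EVENT, DATASET_EVENT }

    static EventKind resolve(String schemaUrl) {
        // Missing schemaURL: fall back to RunEvent.
        if (schemaUrl == null || schemaUrl.isBlank()) {
            return EventKind.RUN_EVENT;
        }
        // Everything after the last '/' is treated as the type name.
        // lastIndexOf returns -1 when there is no '/', so the whole string is used.
        String tail = schemaUrl.substring(schemaUrl.lastIndexOf('/') + 1);
        switch (tail) {
            case "JobEvent":
                return EventKind.JOB_EVENT;
            case "DatasetEvent":
                return EventKind.DATASET_EVENT;
            default:
                // "RunEvent" and any unknown name both resolve to RunEvent.
                return EventKind.RUN_EVENT;
        }
    }
}
```

For example, a `schemaURL` ending in `/DatasetEvent` resolves to `DATASET_EVENT`, while `null`, an empty string, or an unrecognized trailing segment all resolve to `RUN_EVENT`.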
   
   The next PR (persistence/forwarding/query) receives 
`PolarisLineageEvent.event()` already typed as the correct OpenLineage event, 
so no JSON re-parsing is needed.
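
To illustrate how a sealed wrapper hands downstream code an already-typed event, here is a rough sketch that needs Java 17 or later. All names are stand-ins: `LineageEventSketch` mirrors the shape described for `PolarisLineageEvent`, and the `*Stub` records replace the real `io.openlineage.server.OpenLineage` event classes, whose fields are not reproduced here.

```java
// Hypothetical stand-ins for the final OpenLineage event classes.
record RunEventStub(String runId) {}
record JobEventStub(String jobName) {}
record DatasetEventStub(String datasetName) {}

// Sealed base: only the three wrappers below may implement it.
sealed interface LineageEventSketch
        permits LineageEventSketch.OfRunEvent,
                LineageEventSketch.OfJobEvent,
                LineageEventSketch.OfDatasetEvent {

    // The wrapped event, held by composition.
    Object event();

    record OfRunEvent(RunEventStub event) implements LineageEventSketch {}
    record OfJobEvent(JobEventStub event) implements LineageEventSketch {}
    record OfDatasetEvent(DatasetEventStub event) implements LineageEventSketch {}

    // A consumer branches on the wrapper type and gets the typed event directly.
    static String describe(LineageEventSketch wrapper) {
        if (wrapper instanceof OfRunEvent run) {
            return "run:" + run.event().runId();
        }
        if (wrapper instanceof OfJobEvent job) {
            return "job:" + job.event().jobName();
        }
        if (wrapper instanceof OfDatasetEvent dataset) {
            return "dataset:" + dataset.event().datasetName();
        }
        throw new IllegalStateException("unreachable: the hierarchy is sealed");
    }
}
```

Because each wrapper narrows the return type of `event()`, a handler that matches on `OfJobEvent` works with the job event directly, without casts or a second parse of the request body.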
   
   ### Files
   
   - `spec/openlineage-service.yaml`: documentation-only spec; header explains 
why it isn't a codegen source.
   - `api/openlineage-service/`: new Gradle module containing `PolarisOpenLineageApi` 
(JAX-RS resource), `PolarisLineageEvent` (sealed wrapper hierarchy), 
`LineageEventTypeResolver` (Jackson dispatch), `PolarisOpenLineageApiService` 
(service interface).
   - `runtime/service/.../lineage/OpenLineageAdapter.java`: `@RequestScoped` 
no-op CDI bean returning `201`.
   - `gradle/libs.versions.toml`: adds `openlineage-java = "1.48.0"`.
   - `gradle/projects.main.properties`, `runtime/service/build.gradle.kts`: 
wire the new module into the build.
   
   ### Manual verification
   
   End-to-end against a running server:
   
   | Request | Result |
   |---|---|
   | `RunEvent` with full body | `201` |
   | `JobEvent` | `201` |
   | `DatasetEvent` | `201` |
   | Unknown `schemaURL` (RunEvent body) | `201` (falls back to RunEvent) |
   | Missing `schemaURL` | `201` (falls back to RunEvent) |
   | Empty `{}` body | `201` |
   | No `Authorization` header | `401` |
   
   Standalone Jackson dispatch test (proves polymorphism, not just HTTP 
success):
   
   ```
   RunEvent     -> wrapper=OfRunEvent,    event=RunEvent
   JobEvent     -> wrapper=OfJobEvent,    event=JobEvent
   DatasetEvent -> wrapper=OfDatasetEvent, event=DatasetEvent
   unknown->Run -> wrapper=OfRunEvent,    event=RunEvent
   missing->Run -> wrapper=OfRunEvent,    event=RunEvent
   ```
   
   ## Checklist
   - [x] 🛡️ Don't disclose security issues! (contact [email protected])
   - [x] 🔗 Clearly explained why the changes are needed, or linked related 
issues: tracked in the [Polaris OpenLineage 
proposal](https://docs.google.com/document/d/1iOzIuFW66SFL2wZOADD9knMTG21OwY7VmaWVSvMUqQk/)
   - [x] 🧪 Added/updated tests with good coverage, or manually tested (and 
explained how): manual end-to-end + Jackson dispatch test documented above; 
automated tests will land with the persistence PR
   - [x] 💡 Added comments for complex logic: wrapper hierarchy and resolver 
carry rationale comments explaining the Marquez approach and the codegen 
avoidance
   - [ ] 🧾 Updated `CHANGELOG.md` (if needed): N/A for a no-op endpoint with 
no user-facing behavior; will update when persistence/forwarding lands
   - [ ] 📚 Updated documentation in `site/content/in-dev/unreleased` (if 
needed): N/A for the same reason

