adnanhemani opened a new pull request, #4667: URL: https://github.com/apache/polaris/pull/4667
## Summary

Adds the OpenLineage-compatible ingest endpoint defined in the Polaris OpenLineage proposal. This first PR mounts the route and accepts events; persistence, dataset resolution, and downstream forwarding are follow-up PRs.

The endpoint is mounted at the **standard OpenLineage path** (`POST /api/v1/lineage`), so any engine using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target Polaris by URL change alone; no client-side rewriting is required.

### Why hand-written instead of OpenAPI codegen

Body parsing follows the [Marquez](https://github.com/MarquezProject/marquez) (OpenLineage reference server) pattern: a hand-written JAX-RS resource on top of `io.openlineage:openlineage-java`, with Jackson polymorphism keyed on the `schemaURL` field to dispatch between `RunEvent` / `JobEvent` / `DatasetEvent`.

The OpenAPI Generator's Java template cannot translate the OpenLineage spec's `oneOf` faithfully: it collapses the variants into a single class with every variant's required fields marked `@NotNull`, which rejects every valid event with a 400. Codegen is therefore intentionally skipped for this module; `spec/openlineage-service.yaml` is kept as documentation only.

### Wrapper hierarchy

`PolarisLineageEvent` is a sealed base with three permitted subclasses (`OfRunEvent`, `OfJobEvent`, `OfDatasetEvent`). Each wrapper holds the official `io.openlineage.server.OpenLineage.{Run,Job,Dataset}Event` by composition (those classes are `final`). A custom `JsonTypeIdResolver` reads the trailing path segment of `schemaURL` (e.g. `…/RunEvent`) to dispatch. An unknown or missing `schemaURL` falls back to `RunEvent`, matching Marquez behavior.

The next PR (persistence/forwarding/query) receives `PolarisLineageEvent.event()` already typed as the correct OL event, so no JSON re-parsing is needed.
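The dispatch rule described above can be sketched without Jackson or `openlineage-java`. This is a minimal illustration under stated assumptions, not the code in this PR: `LineageDispatchSketch`, its nested records, and the `typeId` / `wrap` helpers are hypothetical names, and the event records are simplified stand-ins for the real `final` OpenLineage classes. It assumes Java 17 or later.

```java
import java.util.Map;

// Hypothetical sketch of schemaURL-based dispatch; not the PR's actual classes.
final class LineageDispatchSketch {

    // Simplified stand-ins for the final OpenLineage event classes (assumption).
    record RunEvent(Map<String, Object> body) {}
    record JobEvent(Map<String, Object> body) {}
    record DatasetEvent(Map<String, Object> body) {}

    // Sealed wrapper hierarchy: each wrapper holds its event by composition.
    sealed interface LineageEvent permits OfRunEvent, OfJobEvent, OfDatasetEvent {
        Object event();
    }
    record OfRunEvent(RunEvent event) implements LineageEvent {}
    record OfJobEvent(JobEvent event) implements LineageEvent {}
    record OfDatasetEvent(DatasetEvent event) implements LineageEvent {}

    // Returns the type id taken from the trailing path segment of schemaURL.
    // Unknown or missing values fall back to "RunEvent".
    static String typeId(String schemaUrl) {
        if (schemaUrl == null) {
            return "RunEvent";
        }
        String id = schemaUrl.substring(schemaUrl.lastIndexOf('/') + 1);
        return switch (id) {
            case "JobEvent", "DatasetEvent" -> id;
            default -> "RunEvent";
        };
    }

    // Wraps an already-parsed JSON body in the wrapper matching its schemaURL.
    static LineageEvent wrap(Map<String, Object> body) {
        Object url = body.get("schemaURL");
        String id = typeId(url instanceof String s ? s : null);
        return switch (id) {
            case "JobEvent" -> new OfJobEvent(new JobEvent(body));
            case "DatasetEvent" -> new OfDatasetEvent(new DatasetEvent(body));
            default -> new OfRunEvent(new RunEvent(body));
        };
    }
}
```

In this sketch, a body whose `schemaURL` ends in `/JobEvent` is wrapped as `OfJobEvent`, while an empty map or an unrecognized suffix falls back to `OfRunEvent`. Because the interface is sealed, a caller can switch over the three wrappers exhaustively instead of re-inspecting the JSON.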
### Files

- `spec/openlineage-service.yaml`: documentation-only spec; its header explains why it isn't a codegen source.
- `api/openlineage-service/`: new Gradle module containing `PolarisOpenLineageApi` (JAX-RS resource), `PolarisLineageEvent` (sealed wrapper hierarchy), `LineageEventTypeResolver` (Jackson dispatch), and `PolarisOpenLineageApiService` (service interface).
- `runtime/service/.../lineage/OpenLineageAdapter.java`: `@RequestScoped` no-op CDI bean returning `201`.
- `gradle/libs.versions.toml`: adds `openlineage-java = "1.48.0"`.
- `gradle/projects.main.properties`, `runtime/service/build.gradle.kts`: wire the new module into the build.

### Manual verification

End-to-end against a running server:

| Request | Result |
|---|---|
| `RunEvent` with full body | `201` |
| `JobEvent` | `201` |
| `DatasetEvent` | `201` |
| Unknown `schemaURL` (RunEvent body) | `201` (falls back to RunEvent) |
| Missing `schemaURL` | `201` (falls back to RunEvent) |
| Empty `{}` body | `201` |
| No `Authorization` header | `401` |

Standalone Jackson dispatch test (proves polymorphism, not just HTTP success):

```
RunEvent      -> wrapper=OfRunEvent, event=RunEvent
JobEvent      -> wrapper=OfJobEvent, event=JobEvent
DatasetEvent  -> wrapper=OfDatasetEvent, event=DatasetEvent
unknown->Run  -> wrapper=OfRunEvent, event=RunEvent
missing->Run  -> wrapper=OfRunEvent, event=RunEvent
```

## Checklist

- [x] Don't disclose security issues! (contact [email protected])
- [x] Clearly explained why the changes are needed, or linked related issues: tracked in the [Polaris OpenLineage proposal](https://docs.google.com/document/d/1iOzIuFW66SFL2wZOADD9knMTG21OwY7VmaWVSvMUqQk/)
- [x] Added/updated tests with good coverage, or manually tested (and explained how): manual end-to-end and Jackson dispatch tests are documented above; automated tests will land with the persistence PR
- [x] Added comments for complex logic: the wrapper hierarchy and resolver carry rationale comments explaining the Marquez approach and the codegen avoidance
- [ ] Updated `CHANGELOG.md` (if needed): N/A for a no-op endpoint with no user-facing behavior; will update when persistence/forwarding lands
- [ ] Updated documentation in `site/content/in-dev/unreleased` (if needed): N/A for the same reason
