adnanhemani commented on code in PR #4667:
URL: https://github.com/apache/polaris/pull/4667#discussion_r3416670006


##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+
+/**
+ * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern
+ * used by other Polaris API modules where the JAX-RS resource sits in the API module and
+ * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}.

Review Comment:
   Agreed — this should stay as the API/runtime delegation layer, not the 
provider contract. Will introduce `OpenLineageIngestProvider` behind it so this 
interface remains the HTTP/runtime seam and the provider seam is the extension 
point.



##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+
+/**
+ * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern
+ * used by other Polaris API modules where the JAX-RS resource sits in the API module and
+ * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}.
+ */
+public interface PolarisOpenLineageApiService {
+
+  /**
+   * Handle an OpenLineage event accepted at the ingest endpoint.
+   *
+   * @param event the parsed OpenLineage event, dispatched to the correct {@code RunEvent}, {@code
+   *     JobEvent}, or {@code DatasetEvent} variant by Jackson based on the {@code schemaURL}
+   *     field.
+   * @return the JAX-RS response. OpenLineage clients expect {@code 201 Created} with no body on
+   *     success.
+  Response sendLineageEvent(

Review Comment:
   Yes, this is the right split. Will add `OpenLineageIngestRequest` (typed OL 
event + realm id as a plain string) and `OpenLineageIngestResult` 
(accept/reject/unavailable). This service maps the request context to an 
`OpenLineageIngestRequest`, calls the provider, then maps the result back to 
`Response`. Provider implementations never see `SecurityContext` or `Response`.
   
   Keeping `OpenLineageIngestRequest` deliberately thin for now — no 
persistence-shaped fields since there's no persistence yet.
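
   A minimal sketch of what that seam could look like, using only the names mentioned in this thread. The field names, the accessor style, and the use of `Object` as a stand-in for the typed OpenLineage event are assumptions for illustration, not the final shape.

```java
import java.util.Objects;

/** Hypothetical shapes for illustration only; not the code in this PR. */
final class IngestSeamSketch {
  private IngestSeamSketch() {}

  /** The three outcomes named above: accept, reject, unavailable. */
  enum OpenLineageIngestResult { ACCEPTED, REJECTED, UNAVAILABLE }

  /** Deliberately thin request: the dispatched event plus the realm id as a plain string. */
  static final class OpenLineageIngestRequest {
    private final Object event; // assumption: placeholder for the typed OpenLineage event
    private final String realmId;

    OpenLineageIngestRequest(Object event, String realmId) {
      this.event = Objects.requireNonNull(event, "event");
      this.realmId = Objects.requireNonNull(realmId, "realmId");
    }

    Object event() {
      return event;
    }

    String realmId() {
      return realmId;
    }
  }

  /** Extension point: implementations never see SecurityContext or Response. */
  interface OpenLineageIngestProvider {
    OpenLineageIngestResult ingest(OpenLineageIngestRequest request);
  }

  /** Default provider in this sketch: accepts every event and discards it. */
  static final class NoOpOpenLineageIngestProvider implements OpenLineageIngestProvider {
    @Override
    public OpenLineageIngestResult ingest(OpenLineageIngestRequest request) {
      return OpenLineageIngestResult.ACCEPTED;
    }
  }
}
```

   Keeping the request free of JAX-RS types means a provider can be unit-tested without any HTTP container.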



##########
spec/openlineage-service.yaml:
##########
@@ -0,0 +1,132 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# This file documents the OpenLineage ingest endpoint exposed by Apache Polaris.
+#
+# IMPORTANT: This spec is documentation, not a code-generation source. The
+# OpenLineage event schema is a `oneOf` over RunEvent / JobEvent / DatasetEvent
+# discriminated by the `schemaURL` field, which the OpenAPI Generator's Java
+# templates cannot translate faithfully (they collapse the variants into a
+# single class with every variant's required fields marked `@NotNull`,
+# rejecting every valid event with a 400). Polaris instead hand-writes the
+# JAX-RS resource on top of the official `io.openlineage:openlineage-java`
+# library, exactly as Marquez (the OpenLineage reference server) does. See
+# `api/openlineage-service/` for the implementation.
+#
+# The authoritative event schema lives upstream:
+#   https://openlineage.io/spec/2-0-2/OpenLineage.json
+#
+# This file is kept so the OpenLineage endpoint shows up alongside Polaris's
+# other API specs and can be browsed by API tools.
+
+---
+openapi: 3.0.3
+info:
+  title: Apache Polaris OpenLineage Service
+  license:
+    name: Apache 2.0
+    url: https://www.apache.org/licenses/LICENSE-2.0.html
+  version: 0.0.1
+  description:
+    Defines the OpenLineage-compatible ingest API exposed by Apache Polaris.
+    Polaris accepts OpenLineage events at the standard OpenLineage path so
+    engines (Spark, Flink, Airflow, Trino, dbt) can be reconfigured by URL
+    alone. Event bodies follow the OpenLineage specification at
+    https://openlineage.io/spec/2-0-2/OpenLineage.json.
+
+servers:
+  - url: "{scheme}://{host}/api/v1"
+    description: Server URL when the port can be inferred from the scheme
+    variables:
+      scheme:
+        description: The scheme of the URI, either http or https.
+        default: https
+      host:
+        description: The host address for the specified server
+        default: localhost
+
+security:
+  - OAuth2: []
+  - BearerAuth: []
+
+paths:
+  /lineage:
+    post:
+      tags:
+        - OpenLineage API
+      summary: Submit an OpenLineage event
+      description:
+        Accepts an OpenLineage RunEvent, JobEvent, or DatasetEvent as defined
+        by the OpenLineage specification
+        (https://openlineage.io/spec/2-0-2/OpenLineage.json).
+
+
+        The body is dispatched to the correct event variant by inspecting the
+        `schemaURL` field — the fragment after the final `/` selects RunEvent,
+        JobEvent, or DatasetEvent. Unrecognized values fall back to RunEvent.
+
+
+        This endpoint is the standard OpenLineage ingest path, so existing
+        OpenLineage transports can target Polaris by URL change alone.
+      operationId: sendLineageEvent
+      requestBody:
+        description:
+          An OpenLineage event. The exact body schema is the upstream
+          OpenLineage 2-0-2 JSON Schema referenced above.
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              additionalProperties: true
+              externalDocs:
+                description: OpenLineage 2-0-2 JSON Schema (authoritative)
+                url: https://openlineage.io/spec/2-0-2/OpenLineage.json
+      responses:
+        201:
+          description:
+            Created. The event was accepted for processing. Per the
+            OpenLineage server specification the response body is empty.
+        400:
+          description: Bad Request - The request body is not a valid OpenLineage event
+        401:
+          description: Unauthorized - The caller is not authenticated
+        403:
+          description: Forbidden - The caller is not authorized to submit lineage events
+        503:
+          description: Service Unavailable - The lineage subsystem is temporarily unable to accept events
+        5XX:
+          description: Server error
+
+components:
+  securitySchemes:
+    OAuth2:
+      type: oauth2
+      description:
+        OAuth2 client-credentials flow against the Polaris token endpoint.
+        The same client-id/secret used for catalog access is used here; only
+        an additional LINEAGE_INGEST privilege grant is required.

Review Comment:
   Correct — that privilege language is ahead of the implementation. Will mark 
it as future work with a `# TODO` comment so it reads as planned behavior, not 
current contract.



##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/LineageEventTypeResolver.java:
##########
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import com.fasterxml.jackson.annotation.JsonTypeInfo;
+import com.fasterxml.jackson.databind.DatabindContext;
+import com.fasterxml.jackson.databind.JavaType;
+import com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase;
+
+/**
+ * Resolves an OpenLineage event JSON body to one of {@link PolarisLineageEvent.OfRunEvent}, {@link
+ * PolarisLineageEvent.OfJobEvent}, or {@link PolarisLineageEvent.OfDatasetEvent} by inspecting the
+ * trailing path segment of the {@code schemaURL} field.
+ *
+ * <p>The OpenLineage spec requires every event to include a {@code schemaURL} pointing at the
+ * variant's JSON schema fragment, e.g. {@code
+ * https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent}. The fragment name is the
+ * stable discriminator across spec versions.
+ *
+ * <p>If {@code schemaURL} is missing or unrecognized, the body is parsed as a 
{@code RunEvent} —

Review Comment:
   Agreed. The `schemaURL` fallback is a parsing/wire-compatibility concern and 
belongs here in the adapter layer. `OpenLineageIngestRequest` will carry the 
already-dispatched typed event, so provider implementations never need to 
understand `schemaURL` semantics.
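
   For reference, a rough sketch of the dispatch rule as the Javadoc above describes it: take the text after the final `/` of `schemaURL` and fall back to `RunEvent` when the value is missing or unrecognized. `SchemaUrlDispatch`, `EventKind`, and `resolve` are hypothetical names for illustration; the actual resolver in this PR is a Jackson `TypeIdResolverBase` subclass and may differ in detail.

```java
/** Hypothetical illustration of the fallback rule; not the resolver in this PR. */
final class SchemaUrlDispatch {
  private SchemaUrlDispatch() {}

  /** The three OpenLineage event variants. */
  enum EventKind { RUN_EVENT, JOB_EVENT, DATASET_EVENT }

  /**
   * Maps a schemaURL to an event variant using the text after the final '/'.
   * A missing or unrecognized value falls back to RUN_EVENT.
   */
  static EventKind resolve(String schemaUrl) {
    if (schemaUrl == null) {
      return EventKind.RUN_EVENT;
    }
    // lastIndexOf returns -1 when there is no '/', so the whole string is used.
    String name = schemaUrl.substring(schemaUrl.lastIndexOf('/') + 1);
    switch (name) {
      case "JobEvent":
        return EventKind.JOB_EVENT;
      case "DatasetEvent":
        return EventKind.DATASET_EVENT;
      default:
        // Covers "RunEvent" as well as anything unrecognized.
        return EventKind.RUN_EVENT;
    }
  }
}
```

   Because the fallback lives entirely in this step, a provider only ever receives an event that has already been resolved to one variant.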



##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import io.micrometer.core.annotation.Timed;
+import io.micrometer.core.aop.MeterTag;
+import jakarta.annotation.security.RolesAllowed;
+import jakarta.inject.Inject;
+import jakarta.validation.Valid;
+import jakarta.validation.constraints.NotNull;
+import jakarta.ws.rs.Consumes;
+import jakarta.ws.rs.POST;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.core.Context;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.eclipse.microprofile.faulttolerance.Timeout;
+
+/**
+ * JAX-RS resource for the OpenLineage ingest endpoint.
+ *
+ * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine
+ * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target
+ * Polaris by URL change alone — no client-side rewriting.
+ *
+ * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code
+ * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field.
+ * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior.
+ *
+ * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because
+ * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf}
+ * over event variants.
+ */
+@Path("/api/v1/lineage")

Review Comment:
   The path `/api/v1/lineage` is the standard OpenLineage HTTP transport 
endpoint — it's what Marquez serves and what Spark/Flink/Airflow/dbt/Trino 
target when you set `transport.type=http`. The PR's core value proposition is 
that engines can point at Polaris with a URL change alone; a non-standard path 
defeats that.
   
   That said, the concern about this path becoming the default interpretation 
of "Polaris lineage" is real. With the `OpenLineageIngestProvider` layer the 
internals are unambiguously OL-specific regardless of the path, but the 
external path is still a one-way door. I'll make the OL-specificity explicit in 
the Javadoc and YAML spec. If there's appetite for a broader namespace 
discussion I'm happy to raise it on the dev list — though I'd want to be 
careful not to break wire compatibility in the process.



##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import io.micrometer.core.annotation.Timed;
+import io.micrometer.core.aop.MeterTag;
+import jakarta.annotation.security.RolesAllowed;
+import jakarta.inject.Inject;
+import jakarta.validation.Valid;
+import jakarta.validation.constraints.NotNull;
+import jakarta.ws.rs.Consumes;
+import jakarta.ws.rs.POST;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.core.Context;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.eclipse.microprofile.faulttolerance.Timeout;
+
+/**
+ * JAX-RS resource for the OpenLineage ingest endpoint.
+ *
+ * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine
+ * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target
+ * Polaris by URL change alone — no client-side rewriting.
+ *
+ * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code
+ * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field.
+ * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior.
+ *
+ * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because
+ * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf}
+ * over event variants.
+ */
+@Path("/api/v1/lineage")
+public class PolarisOpenLineageApi {
+
+  private final PolarisOpenLineageApiService service;
+
+  @Inject
+  public PolarisOpenLineageApi(PolarisOpenLineageApiService service) {
+    this.service = service;
+  }
+
+  @POST
+  @Consumes("application/json")
+  @RolesAllowed("**")
+  @Timed("polaris.OpenLineageApi.sendLineageEvent")
+  @Timeout
+  public Response sendLineageEvent(
+      @NotNull @Valid PolarisLineageEvent event,

Review Comment:
   With the `OpenLineageIngestProvider` seam, `PolarisLineageEvent` stays 
inside the resource + adapter and never crosses the provider boundary. 
Follow-up implementations only see `OpenLineageIngestRequest`, which Polaris 
controls and can evolve independently from the JAX-RS binding or the 
OpenLineage-Java model.



##########
runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java:
##########
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage;
+
+import jakarta.enterprise.context.RequestScoped;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.apache.polaris.service.lineage.api.PolarisLineageEvent;
+import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService;
+
+/**
+ * No-op implementation of the OpenLineage ingest endpoint.
+ *
+ * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will
+ * land in follow-up PRs as described in the Polaris OpenLineage proposal.
+ */
+@RequestScoped
+public class OpenLineageAdapter implements PolarisOpenLineageApiService {

Review Comment:
   Agreed — will inject `OpenLineageIngestProvider` here and have the adapter 
call it. `NoOpOpenLineageIngestProvider` becomes the default CDI bean. Future 
persistence/forwarding PRs swap in a real provider without touching this 
adapter.



##########
runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java:
##########
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage;
+
+import jakarta.enterprise.context.RequestScoped;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.apache.polaris.service.lineage.api.PolarisLineageEvent;
+import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService;
+
+/**
+ * No-op implementation of the OpenLineage ingest endpoint.
+ *
+ * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will
+ * land in follow-up PRs as described in the Polaris OpenLineage proposal.
+ */
+@RequestScoped
+public class OpenLineageAdapter implements PolarisOpenLineageApiService {
+
+  @Override
+  public Response sendLineageEvent(
+      PolarisLineageEvent event, RealmContext realmContext, SecurityContext securityContext) {
+    return Response.status(Response.Status.CREATED).build();

Review Comment:
   Yes — will change this to `OpenLineageIngestResult result = 
provider.ingest(request); return toResponse(result);`. The no-op provider 
returns accepted, this adapter maps it to `201`. Accept/reject stays in the 
provider; HTTP semantics stay here.
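
   A rough sketch of the mapping half, kept free of JAX-RS types so it runs standalone. Only the accepted-to-`201` mapping is stated above; mapping reject to `400` and unavailable to `503` is an assumption taken from the response codes listed in `spec/openlineage-service.yaml`. `IngestResultMapping`, `IngestResult`, and `toStatusCode` are hypothetical names, and the real adapter would build a `Response` rather than return an `int`.

```java
/** Hypothetical illustration of the result-to-HTTP mapping; not the code in this PR. */
final class IngestResultMapping {
  private IngestResultMapping() {}

  /** Stand-in for the provider result (accept/reject/unavailable). */
  enum IngestResult { ACCEPTED, REJECTED, UNAVAILABLE }

  /** Translates a provider outcome to the HTTP status the adapter would return. */
  static int toStatusCode(IngestResult result) {
    switch (result) {
      case ACCEPTED:
        return 201; // Created, empty body
      case REJECTED:
        return 400; // assumption: rejected events surface as Bad Request
      case UNAVAILABLE:
        return 503; // assumption: subsystem temporarily unable to accept events
      default:
        throw new IllegalArgumentException("Unknown result: " + result);
    }
  }
}
```

   The provider never chooses a status code in this arrangement, so the HTTP contract can change without touching provider implementations.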



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
