adnanhemani commented on code in PR #4667: URL: https://github.com/apache/polaris/pull/4667#discussion_r3416670006
########## api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java: ########## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.service.lineage.api; + +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; + +/** + * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern + * used by other Polaris API modules where the JAX-RS resource sits in the API module and + * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}. Review Comment: Agreed — this should stay as the API/runtime delegation layer, not the provider contract. Will introduce `OpenLineageIngestProvider` behind it so this interface remains the HTTP/runtime seam and the provider seam is the extension point. ########## api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java: ########## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.service.lineage.api; + +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; + +/** + * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern + * used by other Polaris API modules where the JAX-RS resource sits in the API module and + * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}. + */ +public interface PolarisOpenLineageApiService { + + /** + * Handle an OpenLineage event accepted at the ingest endpoint. + * + * @param event the parsed OpenLineage event, dispatched to the correct {@code RunEvent}, {@code + * JobEvent}, or {@code DatasetEvent} variant by Jackson based on the {@code schemaURL} + * field. + * @return the JAX-RS response. OpenLineage clients expect {@code 201 Created} with no body on + * success. + */ + Response sendLineageEvent( Review Comment: Yes, this is the right split. Will add `OpenLineageIngestRequest` (typed OL event + realm id as a plain string) and `OpenLineageIngestResult` (accept/reject/unavailable). This service maps the request context to an `OpenLineageIngestRequest`, calls the provider, then maps the result back to `Response`. 
Provider implementations never see `SecurityContext` or `Response`. Keeping `OpenLineageIngestRequest` deliberately thin for now — no persistence-shaped fields since there's no persistence yet. ########## spec/openlineage-service.yaml: ########## @@ -0,0 +1,132 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# + +# This file documents the OpenLineage ingest endpoint exposed by Apache Polaris. +# +# IMPORTANT: This spec is documentation, not a code-generation source. The +# OpenLineage event schema is a `oneOf` over RunEvent / JobEvent / DatasetEvent +# discriminated by the `schemaURL` field, which the OpenAPI Generator's Java +# templates cannot translate faithfully (they collapse the variants into a +# single class with every variant's required fields marked `@NotNull`, +# rejecting every valid event with a 400). Polaris instead hand-writes the +# JAX-RS resource on top of the official `io.openlineage:openlineage-java` +# library, exactly as Marquez (the OpenLineage reference server) does. See +# `api/openlineage-service/` for the implementation. 
+# +# The authoritative event schema lives upstream: +# https://openlineage.io/spec/2-0-2/OpenLineage.json +# +# This file is kept so the OpenLineage endpoint shows up alongside Polaris's +# other API specs and can be browsed by API tools. + +--- +openapi: 3.0.3 +info: + title: Apache Polaris OpenLineage Service + license: + name: Apache 2.0 + url: https://www.apache.org/licenses/LICENSE-2.0.html + version: 0.0.1 + description: + Defines the OpenLineage-compatible ingest API exposed by Apache Polaris. + Polaris accepts OpenLineage events at the standard OpenLineage path so + engines (Spark, Flink, Airflow, Trino, dbt) can be reconfigured by URL + alone. Event bodies follow the OpenLineage specification at + https://openlineage.io/spec/2-0-2/OpenLineage.json. + +servers: + - url: "{scheme}://{host}/api/v1" + description: Server URL when the port can be inferred from the scheme + variables: + scheme: + description: The scheme of the URI, either http or https. + default: https + host: + description: The host address for the specified server + default: localhost + +security: + - OAuth2: [] + - BearerAuth: [] + +paths: + /lineage: + post: + tags: + - OpenLineage API + summary: Submit an OpenLineage event + description: + Accepts an OpenLineage RunEvent, JobEvent, or DatasetEvent as defined + by the OpenLineage specification + (https://openlineage.io/spec/2-0-2/OpenLineage.json). + + + The body is dispatched to the correct event variant by inspecting the + `schemaURL` field — the fragment after the final `/` selects RunEvent, + JobEvent, or DatasetEvent. Unrecognized values fall back to RunEvent. + + + This endpoint is the standard OpenLineage ingest path, so existing + OpenLineage transports can target Polaris by URL change alone. + operationId: sendLineageEvent + requestBody: + description: + An OpenLineage event. The exact body schema is the upstream + OpenLineage 2-0-2 JSON Schema referenced above. 
+ required: true + content: + application/json: + schema: + type: object + additionalProperties: true + externalDocs: + description: OpenLineage 2-0-2 JSON Schema (authoritative) + url: https://openlineage.io/spec/2-0-2/OpenLineage.json + responses: + 201: + description: + Created. The event was accepted for processing. Per the + OpenLineage server specification the response body is empty. + 400: + description: Bad Request - The request body is not a valid OpenLineage event + 401: + description: Unauthorized - The caller is not authenticated + 403: + description: Forbidden - The caller is not authorized to submit lineage events + 503: + description: Service Unavailable - The lineage subsystem is temporarily unable to accept events + 5XX: + description: Server error + +components: + securitySchemes: + OAuth2: + type: oauth2 + description: + OAuth2 client-credentials flow against the Polaris token endpoint. + The same client-id/secret used for catalog access is used here; only + an additional LINEAGE_INGEST privilege grant is required. Review Comment: Correct — that privilege language is ahead of the implementation. Will mark it as future work with a `# TODO` comment so it reads as planned behavior, not current contract. ########## api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/LineageEventTypeResolver.java: ########## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.service.lineage.api; + +import com.fasterxml.jackson.annotation.JsonTypeInfo; +import com.fasterxml.jackson.databind.DatabindContext; +import com.fasterxml.jackson.databind.JavaType; +import com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase; + +/** + * Resolves an OpenLineage event JSON body to one of {@link PolarisLineageEvent.OfRunEvent}, {@link + * PolarisLineageEvent.OfJobEvent}, or {@link PolarisLineageEvent.OfDatasetEvent} by inspecting the + * trailing path segment of the {@code schemaURL} field. + * + * <p>The OpenLineage spec requires every event to include a {@code schemaURL} pointing at the + * variant's JSON schema fragment, e.g. {@code + * https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent}. The fragment name is the + * stable discriminator across spec versions. + * + * <p>If {@code schemaURL} is missing or unrecognized, the body is parsed as a {@code RunEvent} — Review Comment: Agreed. The `schemaURL` fallback is a parsing/wire-compatibility concern and belongs here in the adapter layer. `OpenLineageIngestRequest` will carry the already-dispatched typed event, so provider implementations never need to understand `schemaURL` semantics. ########## api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java: ########## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.service.lineage.api; + +import io.micrometer.core.annotation.Timed; +import io.micrometer.core.aop.MeterTag; +import jakarta.annotation.security.RolesAllowed; +import jakarta.inject.Inject; +import jakarta.validation.Valid; +import jakarta.validation.constraints.NotNull; +import jakarta.ws.rs.Consumes; +import jakarta.ws.rs.POST; +import jakarta.ws.rs.Path; +import jakarta.ws.rs.core.Context; +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; +import org.eclipse.microprofile.faulttolerance.Timeout; + +/** + * JAX-RS resource for the OpenLineage ingest endpoint. + * + * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine + * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target + * Polaris by URL change alone — no client-side rewriting. + * + * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code + * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field. + * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior. 
+ * + * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because + * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf} + * over event variants. + */ +@Path("/api/v1/lineage") Review Comment: The path `/api/v1/lineage` is the standard OpenLineage HTTP transport endpoint — it's what Marquez serves and what Spark/Flink/Airflow/dbt/Trino target when you set `transport.type=http`. The PR's core value proposition is that engines can point at Polaris with a URL change alone; a non-standard path defeats that. That said, the concern about this path becoming the default interpretation of "Polaris lineage" is real. With the `OpenLineageIngestProvider` layer the internals are unambiguously OL-specific regardless of the path, but the external path is still a one-way door. I'll make the OL-specificity explicit in the Javadoc and YAML spec. If there's appetite for a broader namespace discussion I'm happy to raise it on the dev list — though I'd want to be careful not to break wire compatibility in the process. ########## api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java: ########## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.service.lineage.api; + +import io.micrometer.core.annotation.Timed; +import io.micrometer.core.aop.MeterTag; +import jakarta.annotation.security.RolesAllowed; +import jakarta.inject.Inject; +import jakarta.validation.Valid; +import jakarta.validation.constraints.NotNull; +import jakarta.ws.rs.Consumes; +import jakarta.ws.rs.POST; +import jakarta.ws.rs.Path; +import jakarta.ws.rs.core.Context; +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; +import org.eclipse.microprofile.faulttolerance.Timeout; + +/** + * JAX-RS resource for the OpenLineage ingest endpoint. + * + * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine + * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target + * Polaris by URL change alone — no client-side rewriting. + * + * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code + * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field. + * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior. + * + * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because + * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf} + * over event variants. 
+ */ +@Path("/api/v1/lineage") +public class PolarisOpenLineageApi { + + private final PolarisOpenLineageApiService service; + + @Inject + public PolarisOpenLineageApi(PolarisOpenLineageApiService service) { + this.service = service; + } + + @POST + @Consumes("application/json") + @RolesAllowed("**") + @Timed("polaris.OpenLineageApi.sendLineageEvent") + @Timeout + public Response sendLineageEvent( + @NotNull @Valid PolarisLineageEvent event, Review Comment: With the `OpenLineageIngestProvider` seam, `PolarisLineageEvent` stays inside the resource + adapter and never crosses the provider boundary. Follow-up implementations only see `OpenLineageIngestRequest`, which Polaris controls and can evolve independently from the JAX-RS binding or the OpenLineage-Java model. ########## runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java: ########## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.polaris.service.lineage; + +import jakarta.enterprise.context.RequestScoped; +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; +import org.apache.polaris.service.lineage.api.PolarisLineageEvent; +import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService; + +/** + * No-op implementation of the OpenLineage ingest endpoint. + * + * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will + * land in follow-up PRs as described in the Polaris OpenLineage proposal. + */ +@RequestScoped +public class OpenLineageAdapter implements PolarisOpenLineageApiService { Review Comment: Agreed — will inject `OpenLineageIngestProvider` here and have the adapter call it. `NoOpOpenLineageIngestProvider` becomes the default CDI bean. Future persistence/forwarding PRs swap in a real provider without touching this adapter. ########## runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java: ########## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.polaris.service.lineage; + +import jakarta.enterprise.context.RequestScoped; +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.core.SecurityContext; +import org.apache.polaris.core.context.RealmContext; +import org.apache.polaris.service.lineage.api.PolarisLineageEvent; +import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService; + +/** + * No-op implementation of the OpenLineage ingest endpoint. + * + * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will + * land in follow-up PRs as described in the Polaris OpenLineage proposal. + */ +@RequestScoped +public class OpenLineageAdapter implements PolarisOpenLineageApiService { + + @Override + public Response sendLineageEvent( + PolarisLineageEvent event, RealmContext realmContext, SecurityContext securityContext) { + return Response.status(Response.Status.CREATED).build(); Review Comment: Yes — will change this to `OpenLineageIngestResult result = provider.ingest(request); return toResponse(result);`. The no-op provider returns accepted, this adapter maps it to `201`. Accept/reject stays in the provider; HTTP semantics stay here.
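For readers following the thread, the provider seam described in the review comments might look roughly like the sketch below. It is a minimal illustration, not the PR's code: the type names `OpenLineageIngestProvider`, `OpenLineageIngestRequest`, `OpenLineageIngestResult`, and `NoOpOpenLineageIngestProvider` come from the comments, but their shapes here are assumptions. The event is a plain `Object` standing in for the typed OpenLineage event, the adapter returns an integer status instead of a JAX-RS `Response` so the sketch compiles without Jakarta on the classpath, and mapping "reject" to `400` and "unavailable" to `503` is inferred from the response codes listed in the YAML spec rather than stated in the thread.

```java
import java.util.Objects;

/** Assumed outcome type; the thread only names accept/reject/unavailable. */
enum OpenLineageIngestResult {
  ACCEPTED,
  REJECTED,
  UNAVAILABLE
}

/** Assumed thin request: the already-dispatched event plus the realm id as a plain string. */
final class OpenLineageIngestRequest {
  private final Object event; // stand-in for the typed OpenLineage event
  private final String realmId;

  OpenLineageIngestRequest(Object event, String realmId) {
    this.event = Objects.requireNonNull(event, "event");
    this.realmId = Objects.requireNonNull(realmId, "realmId");
  }

  Object event() {
    return event;
  }

  String realmId() {
    return realmId;
  }
}

/** Extension point: implementations never see SecurityContext or Response. */
interface OpenLineageIngestProvider {
  OpenLineageIngestResult ingest(OpenLineageIngestRequest request);
}

/** Default provider: accepts and discards every event. */
final class NoOpOpenLineageIngestProvider implements OpenLineageIngestProvider {
  @Override
  public OpenLineageIngestResult ingest(OpenLineageIngestRequest request) {
    return OpenLineageIngestResult.ACCEPTED;
  }
}

/** Hypothetical stand-in for OpenLineageAdapter; HTTP semantics live only here. */
final class OpenLineageAdapterSketch {
  private final OpenLineageIngestProvider provider;

  OpenLineageAdapterSketch(OpenLineageIngestProvider provider) {
    this.provider = Objects.requireNonNull(provider, "provider");
  }

  /** Builds the request, calls the provider, and maps the result to a status code. */
  int sendLineageEvent(Object event, String realmId) {
    OpenLineageIngestRequest request = new OpenLineageIngestRequest(event, realmId);
    return toStatus(provider.ingest(request));
  }

  /** Assumed mapping: 201 is from the thread; 400 and 503 are inferred from the spec. */
  static int toStatus(OpenLineageIngestResult result) {
    switch (result) {
      case ACCEPTED:
        return 201;
      case REJECTED:
        return 400;
      case UNAVAILABLE:
        return 503;
      default:
        throw new IllegalStateException("Unhandled result: " + result);
    }
  }
}
```

With the no-op provider wired in, `new OpenLineageAdapterSketch(new NoOpOpenLineageIngestProvider()).sendLineageEvent(event, "some-realm")` yields `201`. A later persistence or forwarding provider would replace only the `OpenLineageIngestProvider` bean and leave the adapter untouched, which is the split the comments argue for.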
