adnanhemani commented on code in PR #4667:
URL: https://github.com/apache/polaris/pull/4667#discussion_r3416945717

##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+
+/**
+ * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern
+ * used by other Polaris API modules where the JAX-RS resource sits in the API module and
+ * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}.
+ */
+public interface PolarisOpenLineageApiService {
+
+  /**
+   * Handle an OpenLineage event accepted at the ingest endpoint.
+   *
+   * @param event the parsed OpenLineage event, dispatched to the correct {@code RunEvent}, {@code
+   *     JobEvent}, or {@code DatasetEvent} variant by Jackson based on the {@code schemaURL}
+   *     field.
+   * @return the JAX-RS response. OpenLineage clients expect {@code 201 Created} with no body on
+   *     success.
+   */
+  Response sendLineageEvent(

Review Comment:
   Yes, this is the right split.
   Will add `OpenLineageIngestRequest` (typed OL event plus realm ID as a plain string) and `OpenLineageIngestResult` (accept/reject/unavailable). This service maps the request context to an `OpenLineageIngestRequest`, calls the provider, then maps the result back to a `Response`. Provider implementations never see `SecurityContext` or `Response`. Keeping `OpenLineageIngestRequest` deliberately thin for now: no persistence-shaped fields, since there is no persistence yet.

##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import io.micrometer.core.annotation.Timed;
+import io.micrometer.core.aop.MeterTag;
+import jakarta.annotation.security.RolesAllowed;
+import jakarta.inject.Inject;
+import jakarta.validation.Valid;
+import jakarta.validation.constraints.NotNull;
+import jakarta.ws.rs.Consumes;
+import jakarta.ws.rs.POST;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.core.Context;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.eclipse.microprofile.faulttolerance.Timeout;
+
+/**
+ * JAX-RS resource for the OpenLineage ingest endpoint.
+ *
+ * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine
+ * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target
+ * Polaris by URL change alone — no client-side rewriting.
+ *
+ * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code
+ * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field.
+ * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior.
+ *
+ * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because
+ * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf}
+ * over event variants.
+ */
+@Path("/api/v1/lineage")

Review Comment:
   > From recent community discussions, Polaris is trying to be a platform with replaceable capabilities, not just an opinionated OpenLineage receiver.

   I don't agree that this statement has been widely agreed upon, to be honest. There is no reasonable competitor to the OpenLineage format in open source today.
   If we adopted this suggestion, the URL would have to look something like `https://polaris.com/openlineage/api/v1/lineage`, and every client would have to be configured to talk to `https://polaris.com/openlineage/`, all of which I find awkward. If the community decides to do this, I will make the change, but I disagree with this excessive future-proofing.

##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApi.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import io.micrometer.core.annotation.Timed;
+import io.micrometer.core.aop.MeterTag;
+import jakarta.annotation.security.RolesAllowed;
+import jakarta.inject.Inject;
+import jakarta.validation.Valid;
+import jakarta.validation.constraints.NotNull;
+import jakarta.ws.rs.Consumes;
+import jakarta.ws.rs.POST;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.core.Context;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.eclipse.microprofile.faulttolerance.Timeout;
+
+/**
+ * JAX-RS resource for the OpenLineage ingest endpoint.
+ *
+ * <p>Mounted at the standard OpenLineage path ({@code POST /api/v1/lineage}) so that any engine
+ * already using the OpenLineage HTTP transport (Spark, Flink, Airflow, Trino, dbt) can target
+ * Polaris by URL change alone — no client-side rewriting.
+ *
+ * <p>The body is parsed into a {@link PolarisLineageEvent} which Jackson resolves to one of {@code
+ * OfRunEvent} / {@code OfJobEvent} / {@code OfDatasetEvent} based on the {@code schemaURL} field.
+ * Unrecognized {@code schemaURL} values fall back to {@code RunEvent}, matching Marquez behavior.
+ *
+ * <p>This resource is hand-written rather than generated from the OpenLineage JSON Schema because
+ * the OpenAPI generator's Java template does not faithfully translate the spec's {@code oneOf}
+ * over event variants.
+ */
+@Path("/api/v1/lineage")
+public class PolarisOpenLineageApi {
+
+  private final PolarisOpenLineageApiService service;
+
+  @Inject
+  public PolarisOpenLineageApi(PolarisOpenLineageApiService service) {
+    this.service = service;
+  }
+
+  @POST
+  @Consumes("application/json")
+  @RolesAllowed("**")
+  @Timed("polaris.OpenLineageApi.sendLineageEvent")
+  @Timeout
+  public Response sendLineageEvent(
+      @NotNull @Valid PolarisLineageEvent event,

Review Comment:
   I was hoping to keep this change as small as possible, since it will require a PMC vote, which is why the code is written so directly. This is a reasonable change to make, but it comes at the cost of a larger PR. I will make the change now since you've explicitly requested it.

##########
runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java:
##########
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage;
+
+import jakarta.enterprise.context.RequestScoped;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.apache.polaris.service.lineage.api.PolarisLineageEvent;
+import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService;
+
+/**
+ * No-op implementation of the OpenLineage ingest endpoint.
+ *
+ * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will
+ * land in follow-up PRs as described in the Polaris OpenLineage proposal.
+ */
+@RequestScoped
+public class OpenLineageAdapter implements PolarisOpenLineageApiService {
+
+  @Override
+  public Response sendLineageEvent(
+      PolarisLineageEvent event, RealmContext realmContext, SecurityContext securityContext) {
+    return Response.status(Response.Status.CREATED).build();

Review Comment:
   Yes, will change this to `OpenLineageIngestResult result = provider.ingest(request); return toResponse(result);`. The no-op provider returns accepted, and this adapter maps it to 201. Accept/reject stays in the provider; HTTP semantics stay here.

##########
runtime/service/src/main/java/org/apache/polaris/service/lineage/OpenLineageAdapter.java:
##########
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage;
+
+import jakarta.enterprise.context.RequestScoped;
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+import org.apache.polaris.service.lineage.api.PolarisLineageEvent;
+import org.apache.polaris.service.lineage.api.PolarisOpenLineageApiService;
+
+/**
+ * No-op implementation of the OpenLineage ingest endpoint.
+ *
+ * <p>Accepts and discards events. Persistence, dataset resolution, and downstream forwarding will
+ * land in follow-up PRs as described in the Polaris OpenLineage proposal.
+ */
+@RequestScoped
+public class OpenLineageAdapter implements PolarisOpenLineageApiService {

Review Comment:
   Agreed. Will inject `OpenLineageIngestProvider` here and have the adapter call it. `NoOpOpenLineageIngestProvider` becomes the default CDI bean. Future persistence/forwarding PRs can swap in a real provider without touching this adapter.

##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/LineageEventTypeResolver.java:
##########
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import com.fasterxml.jackson.annotation.JsonTypeInfo;
+import com.fasterxml.jackson.databind.DatabindContext;
+import com.fasterxml.jackson.databind.JavaType;
+import com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase;
+
+/**
+ * Resolves an OpenLineage event JSON body to one of {@link PolarisLineageEvent.OfRunEvent}, {@link
+ * PolarisLineageEvent.OfJobEvent}, or {@link PolarisLineageEvent.OfDatasetEvent} by inspecting the
+ * trailing path segment of the {@code schemaURL} field.
+ *
+ * <p>The OpenLineage spec requires every event to include a {@code schemaURL} pointing at the
+ * variant's JSON schema fragment, e.g. {@code
+ * https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent}. The fragment name is the
+ * stable discriminator across spec versions.
+ *
+ * <p>If {@code schemaURL} is missing or unrecognized, the body is parsed as a {@code RunEvent} —

Review Comment:
   Agreed. The `schemaURL` fallback is a parsing/wire-compatibility concern and belongs here in the adapter layer. `OpenLineageIngestRequest` will carry the already-dispatched typed event, so provider implementations never need to understand `schemaURL` semantics.

##########
api/openlineage-service/src/main/java/org/apache/polaris/service/lineage/api/PolarisOpenLineageApiService.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.lineage.api;
+
+import jakarta.ws.rs.core.Response;
+import jakarta.ws.rs.core.SecurityContext;
+import org.apache.polaris.core.context.RealmContext;
+
+/**
+ * Service interface implemented by the runtime to handle OpenLineage ingest. Mirrors the pattern
+ * used by other Polaris API modules where the JAX-RS resource sits in the API module and
+ * delegates to a CDI-scoped service implementation in {@code polaris-runtime-service}.

Review Comment:
   Agreed: this should stay as the API/runtime delegation layer, not the provider contract. Will introduce `OpenLineageIngestProvider` behind it so this interface remains the HTTP/runtime seam and the provider seam is the extension point.

##########
spec/openlineage-service.yaml:
##########
@@ -0,0 +1,132 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# This file documents the OpenLineage ingest endpoint exposed by Apache Polaris.
+#
+# IMPORTANT: This spec is documentation, not a code-generation source. The
+# OpenLineage event schema is a `oneOf` over RunEvent / JobEvent / DatasetEvent
+# discriminated by the `schemaURL` field, which the OpenAPI Generator's Java
+# templates cannot translate faithfully (they collapse the variants into a
+# single class with every variant's required fields marked `@NotNull`,
+# rejecting every valid event with a 400). Polaris instead hand-writes the
+# JAX-RS resource on top of the official `io.openlineage:openlineage-java`
+# library, exactly as Marquez (the OpenLineage reference server) does. See
+# `api/openlineage-service/` for the implementation.
+#
+# The authoritative event schema lives upstream:
+# https://openlineage.io/spec/2-0-2/OpenLineage.json
+#
+# This file is kept so the OpenLineage endpoint shows up alongside Polaris's
+# other API specs and can be browsed by API tools.
+
+---
+openapi: 3.0.3
+info:
+  title: Apache Polaris OpenLineage Service
+  license:
+    name: Apache 2.0
+    url: https://www.apache.org/licenses/LICENSE-2.0.html
+  version: 0.0.1
+  description:
+    Defines the OpenLineage-compatible ingest API exposed by Apache Polaris.
+    Polaris accepts OpenLineage events at the standard OpenLineage path so
+    engines (Spark, Flink, Airflow, Trino, dbt) can be reconfigured by URL
+    alone. Event bodies follow the OpenLineage specification at
+    https://openlineage.io/spec/2-0-2/OpenLineage.json.
+
+servers:
+  - url: "{scheme}://{host}/api/v1"
+    description: Server URL when the port can be inferred from the scheme
+    variables:
+      scheme:
+        description: The scheme of the URI, either http or https.
+        default: https
+      host:
+        description: The host address for the specified server
+        default: localhost
+
+security:
+  - OAuth2: []
+  - BearerAuth: []
+
+paths:
+  /lineage:
+    post:
+      tags:
+        - OpenLineage API
+      summary: Submit an OpenLineage event
+      description:
+        Accepts an OpenLineage RunEvent, JobEvent, or DatasetEvent as defined
+        by the OpenLineage specification
+        (https://openlineage.io/spec/2-0-2/OpenLineage.json).
+
+
+        The body is dispatched to the correct event variant by inspecting the
+        `schemaURL` field — the fragment after the final `/` selects RunEvent,
+        JobEvent, or DatasetEvent. Unrecognized values fall back to RunEvent.
+
+
+        This endpoint is the standard OpenLineage ingest path, so existing
+        OpenLineage transports can target Polaris by URL change alone.
+      operationId: sendLineageEvent
+      requestBody:
+        description:
+          An OpenLineage event. The exact body schema is the upstream
+          OpenLineage 2-0-2 JSON Schema referenced above.
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              additionalProperties: true
+              externalDocs:
+                description: OpenLineage 2-0-2 JSON Schema (authoritative)
+                url: https://openlineage.io/spec/2-0-2/OpenLineage.json
+      responses:
+        201:
+          description:
+            Created. The event was accepted for processing. Per the
+            OpenLineage server specification the response body is empty.
+        400:
+          description: Bad Request - The request body is not a valid OpenLineage event
+        401:
+          description: Unauthorized - The caller is not authenticated
+        403:
+          description: Forbidden - The caller is not authorized to submit lineage events
+        503:
+          description: Service Unavailable - The lineage subsystem is temporarily unable to accept events
+        5XX:
+          description: Server error
+
+components:
+  securitySchemes:
+    OAuth2:
+      type: oauth2
+      description:
+        OAuth2 client-credentials flow against the Polaris token endpoint.
+        The same client-id/secret used for catalog access is used here; only
+        an additional LINEAGE_INGEST privilege grant is required.

Review Comment:
   Correct, that privilege language is ahead of the implementation. Will mark it as future work with a `# TODO` comment so it reads as planned behavior, not current contract.
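
For readers following the thread, the request/result/provider split described in the replies above could look roughly like the sketch below. This is only an illustration and not the code in the PR. The type names `OpenLineageIngestRequest`, `OpenLineageIngestResult`, `OpenLineageIngestProvider`, and `NoOpOpenLineageIngestProvider` come from the comments; everything else is an assumption: the field shapes, the `toStatus` and `handle` helpers, the wrapper class `OpenLineageIngestSketch`, and the mapping of reject to 400 and unavailable to 503 (chosen because the spec lists those status codes). Plain `int` status codes and an `Object` event stand in for the JAX-RS `Response` and the typed event so the sketch compiles without Polaris or Jakarta dependencies.

```java
import java.util.Objects;

/** Hypothetical sketch of the provider seam discussed in the review; not the PR's code. */
final class OpenLineageIngestSketch {

  /** Thin request: the already-dispatched event plus the realm ID as a plain string. */
  record OpenLineageIngestRequest(Object event, String realmId) {
    OpenLineageIngestRequest {
      Objects.requireNonNull(event, "event");
      Objects.requireNonNull(realmId, "realmId");
    }
  }

  /** Outcome reported by a provider; deliberately free of HTTP types. */
  enum OpenLineageIngestResult {
    ACCEPTED,
    REJECTED,
    UNAVAILABLE
  }

  /** Extension point: implementations never see SecurityContext or Response. */
  interface OpenLineageIngestProvider {
    OpenLineageIngestResult ingest(OpenLineageIngestRequest request);
  }

  /** Default provider in this sketch: accepts and discards every event. */
  static final class NoOpOpenLineageIngestProvider implements OpenLineageIngestProvider {
    @Override
    public OpenLineageIngestResult ingest(OpenLineageIngestRequest request) {
      return OpenLineageIngestResult.ACCEPTED;
    }
  }

  /** Adapter-side mapping from provider outcome to HTTP status (assumed codes). */
  static int toStatus(OpenLineageIngestResult result) {
    return switch (result) {
      case ACCEPTED -> 201;
      case REJECTED -> 400;
      case UNAVAILABLE -> 503;
    };
  }

  /** Stand-in for the adapter method: build the request, call the provider, map the result. */
  static int handle(OpenLineageIngestProvider provider, Object event, String realmId) {
    OpenLineageIngestRequest request = new OpenLineageIngestRequest(event, realmId);
    return toStatus(provider.ingest(request));
  }

  public static void main(String[] args) {
    // A placeholder string stands in for a parsed OpenLineage event.
    int status = handle(new NoOpOpenLineageIngestProvider(), "run-event-placeholder", "realm-a");
    System.out.println(status);
  }
}
```

In this shape, a later persistence or forwarding provider would only need to implement `ingest`, and the HTTP mapping would stay in one place on the adapter side.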
