singhpk234 commented on code in PR #3990:
URL: https://github.com/apache/polaris/pull/3990#discussion_r3249352267

##########
site/content/in-dev/unreleased/delegation-service.md:
##########
@@ -0,0 +1,116 @@
+---
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+title: Delegation Service
+type: docs
+weight: 430
+---
+
+A Delegation Service (D.S.) is a service that works alongside Polaris, either driving Polaris from outside or running inside the Polaris deployment to do work on its behalf. It can be deployed in one of two modes depending on who runs it and which way the calls flow:
+
+- **Pull**: the delegation service runs outside Polaris (e.g. a scheduled compaction or snapshot-expiration job) and calls Polaris over REST to fetch policies, table metadata, etc.
+- **Push**: the delegation service is co-deployed with Polaris, inside the same security boundary. Polaris invokes it for heavy workloads that would otherwise degrade the Polaris service, such as intensive network calls, large I/O operations, or compute-heavy tasks. The delegation service is hidden behind the Polaris deployment; clients cannot reach it and do not need to know whether one is configured or which implementation is in use.
+
+The two modes solve different problems. Pull supports external systems that integrate with Polaris. Push lets a Polaris deployment offload internal work (e.g. table file purge on `DROP ... PURGE`, server-side scan planning) without changing the public API. A single deployment can use both.
+
+```
+Pull mode (delegation service is external):
+
+  ┌──────────────────────┐    REST (pull)     ┌────────────┐
+  │  Delegation service  │ ─────────────────► │  Polaris   │
+  │  (compute engine,    │                    │            │
+  │   maintenance job)   │                    └────────────┘
+  └──────────────────────┘
+
+
+Push mode (delegation service is internal, invisible to clients):
+
+                       ┌── Polaris deployment ─────────────────┐
+                       │                                       │
+  ┌────────┐   REST    │   ┌──────────┐   internal    ┌──────┐ │
+  │ Client │ ────────► │ ─►│ Polaris  │ ────────────► │ D.S. │ │
+  └────────┘           │   └──────────┘               └──────┘ │
+                       │                                       │
+                       └───────────────────────────────────────┘
+```
+
+## Pull mode
+
+Pull mode is the natural fit for **table maintenance services**: data compaction, snapshot expiration, orphan file removal, manifest rewriting, and similar background jobs. These services run on their own schedule, decide which tables to act on, and need policies and metadata from Polaris to drive that work.
+
+In pull mode, the delegation service talks to Polaris **exclusively over REST APIs**: the Iceberg REST Catalog (IRC) endpoints
+for table operations (load, commit, list, credential vending), and the Polaris REST endpoints for catalog-specific resources
+such as policies and generic tables. There is no SDK, callback; every interaction is an outbound HTTP request. Authentication
+uses OAuth2, the standard Polaris auth path; The full REST surface is in the [API specs](#api-specs).
+
+### Example: external compaction service
+
+```
+1. POST /v1/oauth/tokens                          (auth)
+2. GET  /v1/{cat}/namespaces/{ns}/tables          (discover tables)
+3. GET  /polaris/v1/{cat}/applicable-policies     (pull policy)
+        ?namespace={ns}&target-name={tbl}
+        &policyType=system.data-compaction
+4. If "enable": true, run compaction with the parameters from policy.content
+5. Repeat on schedule
+```
+
+## Push mode
+
+In push mode, the delegation service is co-deployed with the Polaris, in the same security boundary as Polaris itself. Polaris invokes it for heavy workloads that would otherwise degrade the Polaris service, such as intensive network calls, large I/O operations, or compute-heavy tasks. External clients cannot reach the delegation service directly, and they cannot tell whether or which one is deployed; Polaris remains the only public entry point.
+
+### Properties
+
+- **Same security boundary as Polaris.** The delegation service is reachable only by Polaris, deployed alongside it (e.g., separate pod within the same trust zone). It can be granted credentials and access that would be unsafe to vend to clients.
+- **No public contract.** The wire protocol between Polaris and the delegation service is internal and may evolve. Clients see only the Polaris REST API.

Review Comment:
   I see. Do you have a specific protocol in mind? HTTP vs. proto?

##########
site/content/in-dev/unreleased/delegation-service.md:
##########
@@ -0,0 +1,116 @@
+## Pull mode

Review Comment:
   Pull mode here is then just like any other REST client to the catalog. I wonder whether we should include it in the DS design, or whether we want it only as a reference for anyone who wants to build such a service?

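To make the comment above concrete: the five-step compaction example quoted in the diff needs nothing beyond ordinary HTTP request construction. The following is a minimal Python sketch of that idea, not the PR's implementation. The function names and the base URL are hypothetical, and the paths and query parameter names (including `policyType`) are copied from the quoted example as written, not checked against the Polaris API spec.

```python
import json
from urllib.parse import urlencode


def tokens_url(base: str) -> str:
    """Step 1: token endpoint, path taken from the quoted example."""
    return f"{base.rstrip('/')}/v1/oauth/tokens"


def tables_url(base: str, catalog: str, namespace: str) -> str:
    """Step 2: list tables in a namespace to discover candidates."""
    return f"{base.rstrip('/')}/v1/{catalog}/namespaces/{namespace}/tables"


def applicable_policies_url(
    base: str,
    catalog: str,
    namespace: str,
    table: str,
    policy_type: str = "system.data-compaction",
) -> str:
    """Step 3: pull the policy that applies to one table.

    Parameter names mirror the quoted example; treat them as assumptions.
    """
    query = urlencode(
        {"namespace": namespace, "target-name": table, "policyType": policy_type}
    )
    return f"{base.rstrip('/')}/polaris/v1/{catalog}/applicable-policies?{query}"


def compaction_enabled(policy_content: str) -> bool:
    """Step 4: run compaction only when the content has "enable": true."""
    try:
        content = json.loads(policy_content)
    except json.JSONDecodeError:
        return False
    return isinstance(content, dict) and content.get("enable") is True
```

A scheduler would call these helpers in order, send the requests with any HTTP client, and repeat on its own schedule (step 5). Nothing in the sketch is specific to a delegation service, which is the point the comment raises.
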
##########
site/content/in-dev/unreleased/delegation-service.md:
##########
@@ -0,0 +1,116 @@
+- **Pluggable, opaque to clients.** Whether a delegation service is configured, and which implementation runs (e.g. an async worker for purge, an engine-aware planner for scan planning), is a deployment-time decision. The same client request behaves identically from the client's point of view regardless of which one is in use.

Review Comment:
   +1, clients should just think they are interacting with Polaris.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
