This is an automated email from the ASF dual-hosted git repository.

yufei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/polaris.git


The following commit(s) were added to refs/heads/main by this push:
     new a496a6fc3 Site: Add docs for catalog federation (#2761)
a496a6fc3 is described below

commit a496a6fc312ba69d69e95ee50ba541ccb077cfe6
Author: Yufei Gu <[email protected]>
AuthorDate: Fri Oct 10 17:06:26 2025 -0700

    Site: Add docs for catalog federation (#2761)
---
 .../content/in-dev/unreleased/federation/_index.md |  26 +++++
 .../federation/hive-metastore-federation.md        | 125 +++++++++++++++++++++
 .../federation/iceberg-rest-federation.md          |  71 ++++++++++++
 3 files changed, 222 insertions(+)

diff --git a/site/content/in-dev/unreleased/federation/_index.md b/site/content/in-dev/unreleased/federation/_index.md
new file mode 100644
index 000000000..e4fbe261a
--- /dev/null
+++ b/site/content/in-dev/unreleased/federation/_index.md
@@ -0,0 +1,26 @@
+---
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+title: Federation
+type: docs
+weight: 703
+---
+
+Guides for federating Polaris with existing metadata services. Expand this section to select a
+specific integration.
diff --git a/site/content/in-dev/unreleased/federation/hive-metastore-federation.md b/site/content/in-dev/unreleased/federation/hive-metastore-federation.md
new file mode 100644
index 000000000..0d39a5e4a
--- /dev/null
+++ b/site/content/in-dev/unreleased/federation/hive-metastore-federation.md
@@ -0,0 +1,125 @@
+---
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+title: Hive Metastore Federation
+type: docs
+weight: 705
+---
+
+Polaris can federate catalog operations to an existing Hive Metastore (HMS). This lets an external
+HMS remain the source of truth for table metadata while Polaris brokers access, policies, and
+multi-engine connectivity.
+
+## Build-time enablement
+
+The Hive factory is packaged as an optional extension and is not baked into default server builds.
+Include it when assembling the runtime or container images by setting the `NonRESTCatalogs` Gradle
+property to include `HIVE` (and any other non-REST backends you need):
+
+```bash
+./gradlew :polaris-server:assemble :polaris-server:quarkusAppPartsBuild --rerun \
+  -DNonRESTCatalogs=HIVE -Dquarkus.container-image.build=true
+```
+
+`runtime/server/build.gradle.kts` wires the extension in only when this flag is present, so binaries
+built without it will reject Hive federation requests.
+
+## Runtime requirements
+
+- **Metastore connectivity:** Expose the HMS Thrift endpoint (`thrift://host:port`) to the Polaris
+  deployment.
+- **Configuration discovery:** Iceberg’s `HiveCatalog` loads Hadoop/Hive client settings from the
+  classpath. Provide `hive-site.xml` (and `core-site.xml` if needed) via
+  `HADOOP_CONF_DIR`/`HIVE_CONF_DIR` or an image layer.
+- **Authentication:** Hive federation only supports `IMPLICIT` authentication, meaning Polaris uses
+  the operating-system or Kerberos identity of the running process (no stored secrets). Ensure the
+  service principal is logged in or holds a valid keytab/TGT before starting Polaris.
+- **Object storage role:** Configure `polaris.service-identity.<realm>.aws-iam.*` (or the default
+  realm) so the server can assume the AWS role referenced by the catalog. The IAM role must allow
+  STS access from the Polaris service identity and grant permissions to the table locations.
+
+### Kerberos setup example
+
+If your Hive Metastore enforces Kerberos, stage the necessary configuration alongside Polaris:
+
+```bash
+export KRB5_CONFIG=/etc/polaris/krb5.conf
+export HADOOP_CONF_DIR=/etc/polaris/hadoop-conf   # contains hive-site.xml with HMS principal
+export HADOOP_OPTS="-Djava.security.auth.login.config=/etc/polaris/jaas.conf"
+kinit -kt /etc/polaris/keytabs/polaris.keytab polaris/[email protected]
+```
+
+- `hive-site.xml` must define `hive.metastore.sasl.enabled=true`, the metastore principal, and
+  client principal pattern (for example `hive.metastore.client.kerberos.principal=polaris/_HOST@REALM`).
+- The JAAS entry (referenced by `java.security.auth.login.config`) should use `useKeyTab=true` and
+  point to the same keytab shown above so the Polaris JVM can refresh credentials automatically.
+- Keep the keytab readable solely by the Polaris service user; the implicit authenticator consumes
+  the TGT at startup and for periodic renewal.
+
+## Creating a federated catalog
+
+Use the Management API (or the Python CLI) to create an external catalog whose connection type is
+`HIVE`. The following request registers a catalog that proxies to an HMS running on
+`thrift://hms.example.internal:9083`:
+
+```bash
+curl -X POST https://<polaris-host>/management/v1/catalogs \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+        "type": "EXTERNAL",
+        "name": "analytics_hms",
+        "storageConfigInfo": {
+          "storageType": "S3",
+          "roleArn": "arn:aws:iam::123456789012:role/polaris-warehouse-access",
+          "region": "us-east-1"
+        },
+        "properties": { "default-base-location": "s3://analytics-bucket/warehouse/" },
+        "connectionConfigInfo": {
+          "connectionType": "HIVE",
+          "uri": "thrift://hms.example.internal:9083",
+          "warehouse": "s3://analytics-bucket/warehouse/",
+          "authenticationParameters": { "authenticationType": "IMPLICIT" }
+        }
+      }'
+```
+
+Grant catalog roles to principal roles exactly as you would for internal catalogs so engines can
+obtain tokens that authorize against the federated metadata.
+
+`default-base-location` is required; it tells Polaris and Iceberg where to place new metadata files.
+`allowedLocations` is optional; supply it only when you want to restrict writers to a specific set of
+prefixes. If your IAM trust policy requires an `externalId` or explicit `userArn`, include those
+optional fields in `storageConfigInfo`. Polaris persists them and supplies them when assuming the
+role cited by `roleArn` during metadata commits.
+
+## Limitations and operational notes
+
+- **Single identity:** Because only `IMPLICIT` authentication is permitted, Polaris cannot mix
+  multiple Hive identities in a single deployment (`HiveFederatedCatalogFactory` rejects other auth
+  types). Plan a deployment topology that aligns the Polaris process identity with the target HMS.
+- **Generic tables:** The Hive extension exposes Iceberg tables registered in HMS. Generic table
+  federation is not implemented (`HiveFederatedCatalogFactory#createGenericCatalog` throws
+  `UnsupportedOperationException`).
+- **Configuration caching:** Atlas-style catalog failover and multi-HMS routing are not yet handled;
+  Polaris initializes one `HiveCatalog` per connection and relies on the underlying Iceberg client
+  for retries.
+
+With these constraints satisfied, Polaris can sit in front of an HMS so that Iceberg tables managed
+there gain OAuth-protected, multi-engine access through the Polaris REST APIs.
diff --git a/site/content/in-dev/unreleased/federation/iceberg-rest-federation.md b/site/content/in-dev/unreleased/federation/iceberg-rest-federation.md
new file mode 100644
index 000000000..8318f4509
--- /dev/null
+++ b/site/content/in-dev/unreleased/federation/iceberg-rest-federation.md
@@ -0,0 +1,71 @@
+---
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+title: Iceberg REST Federation
+type: docs
+weight: 704
+---
+
+Polaris can federate an external Iceberg REST catalog (e.g., another Polaris deployment, AWS Glue, or a custom Iceberg
+REST implementation), enabling a Polaris service to access table and view entities managed by remote Iceberg REST Catalogs.
+
+## Runtime requirements
+
+- **REST endpoint:** The remote service must expose the Iceberg REST specification. Configure
+  firewalls so Polaris can reach the base URI you provide in the connection config.
+- **Authentication:** Polaris forwards requests using the credentials defined in
+  `ConnectionConfigInfo.AuthenticationParameters`. OAuth2 client credentials, bearer tokens, and AWS
+  SigV4 are supported; choose the scheme the remote service expects.
+
+## Creating a federated REST catalog
+
+The snippet below registers an external catalog that forwards to a remote Polaris server using OAuth2
+client credentials. `iceberg-remote-catalog-name` is optional; supply it when the remote server multiplexes
+multiple logical catalogs under one URI.
+
+```bash
+polaris catalogs create \
+    --type EXTERNAL \
+    --storage-type s3 \
+    --role-arn "arn:aws:iam::123456789012:role/polaris-warehouse-access" \
+    --default-base-location "s3://analytics-bucket/warehouse/" \
+    --catalog-connection-type iceberg-rest \
+    --iceberg-remote-catalog-name analytics \
+    --catalog-uri "https://remote-polaris.example.com/catalog/v1" \
+    --catalog-authentication-type OAUTH \
+    --catalog-token-uri "https://remote-polaris.example.com/catalog/v1/oauth/tokens" \
+    --catalog-client-id "<remote-client-id>" \
+    --catalog-client-secret "<remote-client-secret>" \
+    --catalog-client-scopes "PRINCIPAL_ROLE:ALL" \
+    analytics_rest
+```
+
+Refer to the [CLI documentation](../command-line-interface.md#catalogs) for details on alternative authentication types such as BEARER or SIGV4.
+
+Grant catalog roles to principal roles the same way you do for internal catalogs so compute engines
+receive tokens with access to the federated namespace.
+
+## Operational notes
+
+- **Connectivity checks:** Polaris does not lazily probe the remote service; catalog creation fails if
+  the REST endpoint is unreachable or authentication is rejected.
+- **Feature parity:** Federation exposes whatever table/namespace operations the remote service
+  implements. Unsupported features return the remote error directly to callers.
+- **Generic tables:** The REST federation path currently surfaces Iceberg tables only; generic table
+  federation is not implemented.

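The Hive Metastore guide in the patch above asks operators to expose the HMS Thrift endpoint (`thrift://host:port`) to the Polaris deployment. A minimal pre-flight sketch of that check follows. It is not part of the commit, and the helper names `hms_host`, `hms_port`, and `hms_reachable` are hypothetical. It assumes the URI always carries an explicit port, and that bash (for `/dev/tcp`) and coreutils `timeout` are available on the host.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; none of these names come from Polaris.
# Assumption: the URI has the form thrift://host:port with an explicit port.

# Print the host part of a thrift://host:port URI.
hms_host() {
  local rest="${1#thrift://}"   # strip the scheme prefix
  printf '%s\n' "${rest%%:*}"   # keep everything before the first colon
}

# Print the port part of a thrift://host:port URI.
hms_port() {
  local rest="${1#thrift://}"   # strip the scheme prefix
  printf '%s\n' "${rest##*:}"   # keep everything after the last colon
}

# Return 0 if a TCP connection to the endpoint opens within 5 seconds.
# Relies on bash's /dev/tcp redirection; it only proves the port is open,
# not that a metastore (or Kerberos/SASL) handshake would succeed.
hms_reachable() {
  local host port
  host="$(hms_host "$1")"
  port="$(hms_port "$1")"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

Calling `hms_reachable "thrift://hms.example.internal:9083"` from the host or container that will run Polaris, before creating the catalog, would catch firewall or DNS problems early. Authentication and `hive-site.xml` issues still only surface once Polaris itself connects.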