This is an automated email from the ASF dual-hosted git repository.

difin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hive-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 1bf826f  HIVE-29285: Iceberg: Add quick start documentation for docker-compose REST Catalog integrations with Gravitino and Polaris. (#68)
1bf826f is described below

commit 1bf826feb538d9d18993140ad13b066316b765ae
Author: Dmitriy Fingerman <[email protected]>
AuthorDate: Fri Oct 31 21:03:25 2025 -0400

    HIVE-29285: Iceberg: Add quick start documentation for docker-compose REST Catalog integrations with Gravitino and Polaris. (#68)
    
    Co-authored-by: Dmitriy Fingerman <[email protected]>
---
 content/Development/quickStart.md              |   3 +-
 content/docs/latest/quickstart-rest-catalog.md | 399 +++++++++++++++++++++++++
 2 files changed, 401 insertions(+), 1 deletion(-)

diff --git a/content/Development/quickStart.md b/content/Development/quickStart.md
index 26034e2..c3ecff2 100644
--- a/content/Development/quickStart.md
+++ b/content/Development/quickStart.md
@@ -291,4 +291,5 @@ docker compose exec hiveserver2-standalone /bin/bash
 exit
 ```
   
-[/packaging/src/docker/build.sh]: https://github.com/apache/hive/blob/master/packaging/src/docker/build.sh
+### Quick Start with REST Catalog Integration
+Check out the Docker quickstart for REST Catalog integration here: [REST Catalog Integration](https://hive.apache.org/docs/latest/quickstart-rest-catalog)
diff --git a/content/docs/latest/quickstart-rest-catalog.md b/content/docs/latest/quickstart-rest-catalog.md
new file mode 100644
index 0000000..bc2fb95
--- /dev/null
+++ b/content/docs/latest/quickstart-rest-catalog.md
@@ -0,0 +1,399 @@
+---
+title: "Hive 4.2.0 - REST Catalog Integration"
+date: 2025-10-31
+draft: false
+---
+
+# REST Catalog Integration
+
+## Table of Contents
+- [Hive + Gravitino + Keycloak](#hive--gravitino--keycloak)
+  - [Architecture Overview](#architecture-overview)
+  - [Prerequisites](#prerequisites)
+  - [Quickstart](#quickstart)
+  - [Configuration](#configuration)
+    - [Keycloak](#keycloak)
+    - [Gravitino](#gravitino)
+    - [Hive](#hive)
+  - [Networking Notes](#networking-notes)
+- [Hive + Polaris](#hive--polaris)
+  - [Architecture Overview](#architecture-overview-1)
+  - [Prerequisites](#prerequisites-1)
+  - [Quickstart](#quickstart-1)
+  - [Configuration](#configuration-1)
+    - [Polaris](#polaris)
+    - [Hive](#hive-1)
+  - [Networking Notes](#networking-notes-1)
+
+## Hive + Gravitino + Keycloak
+
+- The code for this setup is located in the `packaging/src/docker/thirdparties/gravitino` folder of the Hive repository.
+- It contains a docker-compose-based setup integrating Apache Hive, the Gravitino Iceberg REST server, and Keycloak for OAuth2 authentication. It allows Hive to use an Iceberg REST catalog secured via Keycloak.
+
+### Architecture Overview
+This diagram illustrates the key docker-compose components and their interactions in this setup:
+
+```
+                                  oAuth2 (REST API)
+         +-------------------------------------------------------------------+
+         |                                                                   |
+         |                                                                   v
++--------+----------+               +-------------------+            +-----------------+
+|                   |  RESTCatalog  |                   |   oauth2   |                 |
+|     Hive          |   (REST API)  |      Gravitino    | (REST API) |    Keycloak     |
+|  (HiveServer2)    +-------------->|    Iceberg REST   +----------->|  OAuth2 Auth    |
+|                   |               |       Server      |            |     Server      |
++--------+----------+               +---------+---------+            +-----------------+
+         |                                    |
+  data   |          metadata files            |
+  files  +------------------------------------+
+         |
+         v
++-------------------+               +-------------------+
+|                   |  creates dir  |                   |
+|     /warehouse    |<--------------+       init        |
+|  (Docker volume)  |     sets      |     container     |
+|                   |  permissions  |                   |
++-------------------+               +-------------------+
+```
+
+- Hive:
+    - Runs HiveServer2 and connects to Gravitino via the Iceberg REST catalog.
+    - Writes Iceberg data files to the shared warehouse volume.
+- Gravitino:
+    - Exposes a REST API for the Iceberg catalog.
+    - Writes Iceberg metadata files (.metadata.json) to the shared warehouse volume.
+    - Does not support serving as an OAuth2 provider, so this example uses an external OAuth2 provider (Keycloak).
+- Keycloak:
+    - OAuth2 server providing authentication and token issuance for Hive/Gravitino.
+- /warehouse:
+    - Shared Docker volume for Iceberg table data and metadata.
+- Init container:
+    - Creates the shared /warehouse folder and sets filesystem permissions as a one-time initialization step.
+
+### Prerequisites
+- Hive version 4.2.0+
+- Docker & Docker Compose
+- Java (for local Hive beeline client)
+- ```$HIVE_HOME``` environment variable pointing to Hive installation (for connecting to Beeline)
+
+### Quickstart
+
+#### STEP 1: Export the Hive version
+```shell
+export HIVE_VERSION=4.2.0
+```
+
+#### STEP 2: Start services
+```shell
+docker-compose up -d
+```
+
+#### STEP 3: Connect to beeline
+```shell
+"${HIVE_HOME}/bin/beeline" -u "jdbc:hive2://localhost:10001/default" -n hive -p hive
+```
+
+#### STEP 4: Stop services
+```shell
+docker-compose down -v
+```
+
+### Configuration
+
+#### Keycloak
+
+- Realm: hive
+- Client: iceberg-client
+    - Secret: iceberg-client-secret
+    - Protocol: OpenID Connect
+    - Audience: hive-iceberg
+- Imported via `realm-export.json` in Keycloak container.
+- Port: 8080
+
+#### Gravitino
+
+- HTTP port: 9001
+- Catalog backend: JDBC H2 (/tmp/gravitino_h2_db)
+- Warehouse: /warehouse (shared with Hive)
+- Iceberg REST Catalog Backend config:
+    ```
+    # Backend type for the catalog. Here we use JDBC (H2 database) as the metadata store.
+    gravitino.iceberg-rest.catalog-backend = jdbc
+    
+    # JDBC connection URI for the H2 database storing catalog metadata.
+    gravitino.iceberg-rest.uri = jdbc:h2:file:/tmp/gravitino_h2_db;AUTO_SERVER=TRUE
+    
+    # JDBC driver class used to connect to the metadata database.
+    gravitino.iceberg-rest.jdbc-driver = org.h2.Driver
+    
+    # Database username for connecting to the metadata store.
+    gravitino.iceberg-rest.jdbc-user = sa
+    
+    # Database password for connecting to the metadata store (empty here).
+    gravitino.iceberg-rest.jdbc-password = ""
+    
+    # Whether to initialize the catalog schema on startup.
+    gravitino.iceberg-rest.jdbc-initialize = true
+    
+    # --- Warehouse Location (shared folder) ---
+    
+    # Path to the Iceberg warehouse directory shared with Hive.
+    gravitino.iceberg-rest.warehouse = file:///warehouse
+    ```
+- OAuth2 config pointing to Keycloak:
+    ```
+    # Enables OAuth2 as the authentication mechanism for Gravitino.
+    gravitino.authenticators = oauth
+    
+    # URL of the Keycloak realm to request tokens from.
+    gravitino.authenticator.oauth.serverUri = http://keycloak:8080/realms/hive
+    
+    # Path to the OAuth2 token endpoint on Keycloak.
+    gravitino.authenticator.oauth.tokenPath = /protocol/openid-connect/token
+    
+    # OAuth2 scopes requested when obtaining a token. Includes "openid" and the custom "catalog" scope.
+    gravitino.authenticator.oauth.scope = openid catalog
+    
+    # OAuth2 client ID registered in Keycloak.
+    gravitino.authenticator.oauth.clientId = iceberg-client
+    
+    # OAuth2 client secret associated with the client ID.
+    gravitino.authenticator.oauth.clientSecret = iceberg-client-secret
+    
+    # Java class used to validate incoming JWT tokens using the JWKS endpoint.
+    gravitino.authenticator.oauth.tokenValidatorClass = org.apache.gravitino.server.authentication.JwksTokenValidator
+    
+    # URL to fetch JSON Web Key Set (JWKS) for verifying token signatures.
+    gravitino.authenticator.oauth.jwksUri = http://keycloak:8080/realms/hive/protocol/openid-connect/certs
+    
+    # Identifier for the OAuth2 provider configuration in Gravitino.
+    gravitino.authenticator.oauth.provider = default
+    
+    # JWT claim field(s) to extract as the principal/username (here, 'sub' claim).
+    gravitino.authenticator.oauth.principalFields = sub
+    
+    # Acceptable clock skew (in seconds) when validating token expiration times.
+    gravitino.authenticator.oauth.allowSkewSecs = 60
+    
+    # Expected audience claim in the token to ensure it is intended for this service.
+    gravitino.authenticator.oauth.serviceAudience = hive-iceberg
+    ```
+
+#### Hive
+
+- Uses ```HiveRESTCatalogClient``` for connecting to Iceberg REST catalog (Gravitino).
+- Catalog configuration in ```hive-site.xml```:
+    ```
+    <property>
+      <name>metastore.catalog.default</name>
+      <value>ice01</value>
+      <description>Sets the default Iceberg catalog for Hive. Here, "ice01" is used.</description>
+    </property>
+    
+    <property>
+      <name>metastore.client.impl</name>
+      <value>org.apache.iceberg.hive.client.HiveRESTCatalogClient</value>
+      <description>Specifies the client implementation to use for accessing Iceberg via REST.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.uri</name>
+      <value>http://gravitino:9001/iceberg</value>
+      <description>URI of the Iceberg REST server (Gravitino). Hive will send catalog requests here.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.type</name>
+      <value>rest</value>
+      <description>Defines the catalog type as "rest", indicating it uses a REST API backend.</description>
+    </property>
+    
+    <!-- Iceberg REST Catalog: OAuth2 authentication -->
+    
+    <property>
+      <name>iceberg.catalog.ice01.rest.auth.type</name>
+      <value>oauth2</value>
+      <description>Configures Hive to use OAuth2 for authenticating requests to the REST catalog.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.oauth2-server-uri</name>
+      <value>http://keycloak:8080/realms/hive/protocol/openid-connect/token</value>
+      <description>URL of the Keycloak OAuth2 token endpoint used to request access tokens.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.credential</name>
+      <value>iceberg-client:iceberg-client-secret</value>
+      <description>Client credentials (ID and secret) used to authenticate with Keycloak.</description>
+    </property>
+    ```
+- HiveServer2 port: 10000 (mapped to 10001 in Docker Compose)
+
+### Networking Notes
+
+- All containers share a custom bridge network ```hive-net```.
+- Services communicate via container names: hive, gravitino, keycloak.
+- Ports mapped for host access:
+    - Keycloak → 8080
+    - Gravitino → 9001
+    - HiveServer2 → 10001
+
+## Hive + Polaris
+
+- The code for this setup is located in the `packaging/src/docker/thirdparties/polaris` folder of the Hive repository.
+- It contains a docker-compose-based setup integrating Apache Hive and Polaris.
+- It allows Hive to use an Iceberg REST catalog secured with OAuth2 provided by Polaris.
+
+### Architecture Overview
+This diagram illustrates the key docker-compose components and their interactions in this setup:
+```
++-------------------+               +-------------------+
+|                   |  RESTCatalog  |                   |
+|     Hive          |   (REST API)  |      Polaris      |<-------+
+|  (HiveServer2)    +-------------->|      Server       |        |
+|                   |    oAuth2     |                   |        |
++--------+----------+  (REST API)   +---------+---------+        | creates:
+         |                                    |                  |     catalog,
+  data   |           metadata files           |                  |     principal,
+  files  +------------------------------------+                  |     roles,
+         |                                                       |     grants (REST API)
+         v                                                       |
++-------------------+               +-------------------+        |
+|                   |  creates dir  |                   |        |
+|     /warehouse    |<--------------+    Polaris-init   +--------+
+|  (Docker volume)  |     syncs     |      container    |
+|                   |  permissions  |                   |
++-------------------+               +-------------------+
+```
+
+- Hive:
+    - Runs HiveServer2 and connects to Polaris via the Iceberg REST catalog.
+    - Writes Iceberg data files to the shared warehouse volume.
+- Polaris:
+    - Exposes a REST API for the Iceberg catalog and provides OAuth2 for authentication.
+    - Supports serving as an OAuth2 provider, so this example doesn't need an external OAuth2 component.
+    - Writes Iceberg metadata files (.metadata.json) to the shared warehouse volume.
+- /warehouse:
+    - Shared Docker volume for Iceberg table data and metadata.
+- Polaris-init:
+    - Bootstraps Polaris for Hive-Iceberg.
+    - Creates and configures Polaris resources via REST API.
+    - Continuously synchronizes filesystem permissions for the shared /warehouse/* folders.
+        - Required because Polaris and Hive run as different users in their respective containers.
+
+### Prerequisites
+- Hive version 4.2.0+
+- Docker & Docker Compose
+- Java (for local Hive beeline client)
+- ```$HIVE_HOME``` environment variable pointing to Hive installation (for connecting to Beeline)
+
+### Quickstart
+
+#### STEP 1: Export the Hive version
+```shell
+export HIVE_VERSION=4.2.0
+```
+
+#### STEP 2: Start services
+```shell
+docker-compose up -d
+```
+
+#### STEP 3: Connect to beeline
+```shell
+"${HIVE_HOME}/bin/beeline" -u "jdbc:hive2://localhost:10001/default" -n hive -p hive
+```
+
+#### STEP 4: Stop services
+```shell
+docker-compose down -v
+```
+
+### Configuration
+
+#### Polaris
+
+- HTTP port: 8181
+- Warehouse: /warehouse (shared with Hive)
+- Key Polaris configs (defined via environment variables in docker-compose.yml):
+    ```
+    # A realm provides logical isolation for different Polaris environments.
+    polaris.realm-context.realms: POLARIS
+
+    # Initial bootstrap credentials for the Polaris server.
+    # The format is: <realm-name>,<client-id>,<client-secret>
+    POLARIS_BOOTSTRAP_CREDENTIALS: POLARIS,iceberg-client,iceberg-client-secret
+    ```
+
+#### Hive
+
+- Uses ```HiveRESTCatalogClient``` for connecting to Iceberg REST catalog (Polaris).
+- Catalog configuration in ```hive-site.xml```:
+    ```
+    <property>
+      <name>metastore.catalog.default</name>
+      <value>ice01</value>
+      <description>Sets the default Iceberg catalog for Hive. Here, "ice01" is used.</description>
+    </property>
+    
+    <property>
+      <name>metastore.client.impl</name>
+      <value>org.apache.iceberg.hive.client.HiveRESTCatalogClient</value>
+      <description>Specifies the client implementation to use for accessing Iceberg via REST.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.uri</name>
+      <value>http://polaris:8181/api/catalog</value>
+      <description>URI of the Iceberg REST server (Polaris). Hive will send catalog requests here.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.type</name>
+      <value>rest</value>
+      <description>Defines the catalog type as "rest", indicating it uses a REST API backend.</description>
+    </property>
+  
+    <property>
+      <name>hive.metastore.warehouse.dir</name>
+      <value>file:///warehouse</value>
+      <description>Defines the warehouse location, required for Polaris.</description>
+    </property>
+    
+    <!-- Iceberg REST Catalog: OAuth2 authentication -->
+    
+    <property>
+      <name>iceberg.catalog.ice01.rest.auth.type</name>
+      <value>oauth2</value>
+      <description>Configures Hive to use OAuth2 for authenticating requests to the REST catalog.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.oauth2-server-uri</name>
+      <value>http://polaris:8181/api/catalog/v1/oauth/tokens</value>
+      <description>URL of the Polaris OAuth2 token endpoint used to request access tokens.</description>
+    </property>
+    
+    <property>
+      <name>iceberg.catalog.ice01.credential</name>
+      <value>iceberg-client:iceberg-client-secret</value>
+      <description>Client credentials (ID and secret) used to authenticate with Polaris.</description>
+    </property>
+  
+    <property>
+      <name>iceberg.catalog.ice01.scope</name>
+      <value>PRINCIPAL_ROLE:ALL</value>
+      <description>OAuth2 scope tied to the principal role defined in Polaris.</description>
+    </property>
+    ```
+- HiveServer2 port: 10000 (mapped to 10001 in Docker Compose)
+
+### Networking Notes
+
+- All containers share a custom bridge network ```hive-net```.
+- Services communicate via container names: hive and polaris.
+- Ports mapped for host access:
+    - Polaris → 8181
+    - HiveServer2 → 10001
