This is an automated email from the ASF dual-hosted git repository.
fanng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git
The following commit(s) were added to refs/heads/main by this push:
new b79bb5267a [#7119] Improvement(dev): Support environment variable
injection for gravitino.conf (#7166)
b79bb5267a is described below
commit b79bb5267a38f780acd978b6fb0838a3db2093c1
Author: Kyle Lin <[email protected]>
AuthorDate: Fri May 23 14:09:30 2025 +0800
[#7119] Improvement(dev): Support environment variable injection for
gravitino.conf (#7166)
### What changes were proposed in this pull request?
Enhanced the `dev/docker/gravitino` Docker image to support environment
variable injection for `gravitino.conf` configuration. This change
aligns the Gravitino server image behavior with the existing design
pattern used by `iceberg-rest-server`.
- Added a new `rewrite_gravitino_server_config.py` script that reads
predefined environment variables and updates `gravitino.conf` at
container startup.
- Added `start-gravitino.sh` to execute
`rewrite_gravitino_server_config.py` before launching the Gravitino
server.
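The override order used by the rewrite script can be sketched as below. This is an illustrative reduction, not the committed file: the names `ENV_MAP`, `DEFAULTS`, `parse_config` and `rewrite_config` are hypothetical, only two of the mapped variables are shown, and lines without `=` are skipped here as an added assumption.

```python
# Illustrative sketch only; names are hypothetical, not the committed script.
CONFIG_PREFIX = "gravitino."

# Two entries copied from the mapping in this commit; the real table is longer.
ENV_MAP = {
    "GRAVITINO_SERVER_WEBSERVER_HTTP_PORT": "server.webserver.httpPort",
    "GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL": "entity.store.relational.jdbcUrl",
}

DEFAULTS = {
    "server.webserver.httpPort": "8090",
    "entity.store.relational.jdbcUrl": "jdbc:h2",
}


def parse_config(text):
    """Parse `key = value` lines, skipping blanks and `#` comments."""
    config = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: lines without "=" are ignored rather than raising.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def rewrite_config(text, environ):
    """Return new file content: existing entries, then defaults, then env overrides."""
    config = parse_config(text)
    for key, value in DEFAULTS.items():
        config[CONFIG_PREFIX + key] = value
    for env_name, key in ENV_MAP.items():
        if env_name in environ:
            config[CONFIG_PREFIX + key] = environ[env_name]
    return "".join("{} = {}\n".format(k, v) for k, v in config.items())


if __name__ == "__main__":
    original = "# sample\ngravitino.server.webserver.httpPort = 8090\n"
    env = {"GRAVITINO_SERVER_WEBSERVER_HTTP_PORT": "8080"}
    print(rewrite_config(original, env), end="")
```

In this sketch, comments from the input are not carried into the output, and a default replaces a value already present in the file unless the matching environment variable is set.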
### Why are the changes needed?
Fixes #7119
### Does this PR introduce any user-facing change?
This is an internal improvement to the `dev/docker/gravitino` image and
does not affect Gravitino core functionality.
### How was this patch tested?
- Ran `./gradlew clean build`.
- Executed `dev/docker/gravitino/gravitino-dependency.sh` to verify the
dependency script works properly and built the Gravitino image.
- Launched a basic container and confirmed the Gravitino server started
successfully:
```bash
./dev/docker/build-docker.sh --platform all --type gravitino \
--image kylelin0927/gravitino --tag test-rewrite-config
```
Verified environment variable injection for key settings:
- Test `GRAVITINO_SERVER_WEBSERVER_HTTP_PORT`: Server was reachable on
port 8080.
```bash
docker run --rm -d \
-e GRAVITINO_SERVER_WEBSERVER_HTTP_PORT=8080 \
-p 8080:8080 \
kylelin0927/gravitino:test-rewrite-config
```
- Test `GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL` and
`GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_DRIVER`: Server started
successfully without error logs.
```bash
docker run --rm -d \
-e GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL="jdbc:postgresql://localhost:5432/database1" \
-e GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_DRIVER="org.postgresql.Driver" \
-p 8090:8090 \
kylelin0927/gravitino:test-rewrite-config
```
- Tested `GRAVITINO_SERVER_WEBSERVER_MIN_THREADS` and
`GRAVITINO_SERVER_WEBSERVER_MAX_THREADS`: verified that
`/root/gravitino/conf/gravitino.conf` contains the correct values using
cat and grep.
```bash
docker run -d --rm --name gravitino-test \
-e GRAVITINO_SERVER_WEBSERVER_MIN_THREADS=10 \
-e GRAVITINO_SERVER_WEBSERVER_MAX_THREADS=50 \
kylelin0927/gravitino:test-rewrite-config
```
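The manual `cat` and `grep` check could also be scripted. The sketch below is hypothetical (the helper `find_mismatches` is not part of this commit) and assumes the file content has already been captured, for example with `docker exec gravitino-test cat /root/gravitino/conf/gravitino.conf`.

```python
def find_mismatches(conf_text, expected):
    """Return {key: (expected, actual)} for entries that are missing or differ."""
    actual = {}
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key, value = stripped.split("=", 1)
            actual[key.strip()] = value.strip()
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


if __name__ == "__main__":
    # Sample text standing in for the captured gravitino.conf content.
    sample = (
        "gravitino.server.webserver.minThreads = 10\n"
        "gravitino.server.webserver.maxThreads = 50\n"
    )
    expected = {
        "gravitino.server.webserver.minThreads": "10",
        "gravitino.server.webserver.maxThreads": "50",
    }
    print(find_mismatches(sample, expected))
```

An empty result means every expected key was found with the expected value.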
---
dev/docker/gravitino/Dockerfile | 4 +-
dev/docker/gravitino/gravitino-dependency.sh | 4 +
.../gravitino/rewrite_gravitino_server_config.py | 112 +++++++++++++++++++++
.../gravitino/{Dockerfile => start-gravitino.sh} | 14 +--
docs/gravitino-server-config.md | 83 ++++++++++++++-
5 files changed, 208 insertions(+), 9 deletions(-)
diff --git a/dev/docker/gravitino/Dockerfile b/dev/docker/gravitino/Dockerfile
index 9f2ce7e90c..cf10e54d43 100644
--- a/dev/docker/gravitino/Dockerfile
+++ b/dev/docker/gravitino/Dockerfile
@@ -26,4 +26,6 @@ COPY packages/gravitino /root/gravitino
EXPOSE 8090
EXPOSE 9001
-ENTRYPOINT ["/bin/bash", "/root/gravitino/bin/gravitino.sh", "start"]
+RUN chmod +x /root/gravitino/bin/start-gravitino.sh
+
+ENTRYPOINT ["/bin/bash", "/root/gravitino/bin/start-gravitino.sh"]
diff --git a/dev/docker/gravitino/gravitino-dependency.sh b/dev/docker/gravitino/gravitino-dependency.sh
index b24d1f18e9..3393658c96 100755
--- a/dev/docker/gravitino/gravitino-dependency.sh
+++ b/dev/docker/gravitino/gravitino-dependency.sh
@@ -35,6 +35,10 @@ mkdir -p "${gravitino_dir}/packages"
cp -r "${gravitino_home}/distribution/package" "${gravitino_dir}/packages/gravitino"
+mkdir -p "${gravitino_dir}/packages/gravitino/bin"
+cp "${gravitino_dir}/rewrite_gravitino_server_config.py" "${gravitino_dir}/packages/gravitino/bin/"
+cp "${gravitino_dir}/start-gravitino.sh" "${gravitino_dir}/packages/gravitino/bin/"
+
# Copy the Aliyun, AWS, GCP and Azure bundles to the Hadoop catalog libs
cp ${gravitino_home}/bundles/aliyun-bundle/build/libs/*.jar "${gravitino_dir}/packages/gravitino/catalogs/hadoop/libs"
cp ${gravitino_home}/bundles/aws-bundle/build/libs/*.jar "${gravitino_dir}/packages/gravitino/catalogs/hadoop/libs"
diff --git a/dev/docker/gravitino/rewrite_gravitino_server_config.py b/dev/docker/gravitino/rewrite_gravitino_server_config.py
new file mode 100755
index 0000000000..3094e8461a
--- /dev/null
+++ b/dev/docker/gravitino/rewrite_gravitino_server_config.py
@@ -0,0 +1,112 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+
+env_map = {
+    "GRAVITINO_SERVER_SHUTDOWN_TIMEOUT": "server.shutdown.timeout",
+    "GRAVITINO_SERVER_WEBSERVER_HOST": "server.webserver.host",
+    "GRAVITINO_SERVER_WEBSERVER_HTTP_PORT": "server.webserver.httpPort",
+    "GRAVITINO_SERVER_WEBSERVER_MIN_THREADS": "server.webserver.minThreads",
+    "GRAVITINO_SERVER_WEBSERVER_MAX_THREADS": "server.webserver.maxThreads",
+    "GRAVITINO_SERVER_WEBSERVER_STOP_TIMEOUT": "server.webserver.stopTimeout",
+    "GRAVITINO_SERVER_WEBSERVER_IDLE_TIMEOUT": "server.webserver.idleTimeout",
+    "GRAVITINO_SERVER_WEBSERVER_THREAD_POOL_WORK_QUEUE_SIZE": "server.webserver.threadPoolWorkQueueSize",
+    "GRAVITINO_SERVER_WEBSERVER_REQUEST_HEADER_SIZE": "server.webserver.requestHeaderSize",
+    "GRAVITINO_SERVER_WEBSERVER_RESPONSE_HEADER_SIZE": "server.webserver.responseHeaderSize",
+    "GRAVITINO_ENTITY_STORE": "entity.store",
+    "GRAVITINO_ENTITY_STORE_RELATIONAL": "entity.store.relational",
+    "GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL": "entity.store.relational.jdbcUrl",
+    "GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_DRIVER": "entity.store.relational.jdbcDriver",
+    "GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_USER": "entity.store.relational.jdbcUser",
+    "GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_PASSWORD": "entity.store.relational.jdbcPassword",
+    "GRAVITINO_CATALOG_CACHE_EVICTION_INTERVAL_MS": "catalog.cache.evictionIntervalMs",
+    "GRAVITINO_AUTHORIZATION_ENABLE": "authorization.enable",
+    "GRAVITINO_AUTHORIZATION_SERVICE_ADMINS": "authorization.serviceAdmins",
+    "GRAVITINO_AUX_SERVICE_NAMES": "auxService.names",
+    "GRAVITINO_ICEBERG_REST_CLASSPATH": "iceberg-rest.classpath",
+    "GRAVITINO_ICEBERG_REST_HOST": "iceberg-rest.host",
+    "GRAVITINO_ICEBERG_REST_HTTP_PORT": "iceberg-rest.httpPort",
+    "GRAVITINO_ICEBERG_REST_CATALOG_BACKEND": "iceberg-rest.catalog-backend",
+    "GRAVITINO_ICEBERG_REST_WAREHOUSE": "iceberg-rest.warehouse"
+}
+
+init_config = {
+    "server.shutdown.timeout": "3000",
+    "server.webserver.host": "0.0.0.0",
+    "server.webserver.httpPort": "8090",
+    "server.webserver.minThreads": "24",
+    "server.webserver.maxThreads": "200",
+    "server.webserver.stopTimeout": "30000",
+    "server.webserver.idleTimeout": "30000",
+    "server.webserver.threadPoolWorkQueueSize": "100",
+    "server.webserver.requestHeaderSize": "131072",
+    "server.webserver.responseHeaderSize": "131072",
+    "entity.store": "relational",
+    "entity.store.relational": "JDBCBackend",
+    "entity.store.relational.jdbcUrl": "jdbc:h2",
+    "entity.store.relational.jdbcDriver": "org.h2.Driver",
+    "entity.store.relational.jdbcUser": "gravitino",
+    "entity.store.relational.jdbcPassword": "gravitino",
+    "catalog.cache.evictionIntervalMs": "3600000",
+    "authorization.enable": "false",
+    "authorization.serviceAdmins": "anonymous",
+    "auxService.names": "iceberg-rest",
+    "iceberg-rest.classpath": "iceberg-rest-server/libs, iceberg-rest-server/conf",
+    "iceberg-rest.host": "0.0.0.0",
+    "iceberg-rest.httpPort": "9001",
+    "iceberg-rest.catalog-backend": "memory",
+    "iceberg-rest.warehouse": "/tmp/"
+}
+
+def parse_config_file(file_path):
+    config_map = {}
+    with open(file_path, "r") as file:
+        for line in file:
+            stripped_line = line.strip()
+            if stripped_line and not stripped_line.startswith("#"):
+                key, value = stripped_line.split("=", 1)
+                key = key.strip()
+                value = value.strip()
+                config_map[key] = value
+    return config_map
+
+
+config_prefix = "gravitino."
+
+
+def update_config(config, key, value):
+    config[config_prefix + key] = value
+
+
+config_file_path = "conf/gravitino.conf"
+config_map = parse_config_file(config_file_path)
+
+for k, v in init_config.items():
+    update_config(config_map, k, v)
+
+for k, v in env_map.items():
+    if k in os.environ:
+        update_config(config_map, v, os.environ[k])
+
+if os.path.exists(config_file_path):
+    os.remove(config_file_path)
+
+with open(config_file_path, "w") as file:
+    for key, value in config_map.items():
+        line = "{} = {}\n".format(key, value)
+        file.write(line)
\ No newline at end of file
diff --git a/dev/docker/gravitino/Dockerfile b/dev/docker/gravitino/start-gravitino.sh
old mode 100644
new mode 100755
similarity index 77%
copy from dev/docker/gravitino/Dockerfile
copy to dev/docker/gravitino/start-gravitino.sh
index 9f2ce7e90c..1cc612348c
--- a/dev/docker/gravitino/Dockerfile
+++ b/dev/docker/gravitino/start-gravitino.sh
@@ -1,3 +1,4 @@
+#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
@@ -16,14 +17,13 @@
# specific language governing permissions and limitations
# under the License.
#
-FROM openjdk:17-jdk-buster
-LABEL maintainer="[email protected]"
-WORKDIR /root/gravitino
+set -ex
+bin_dir="$(dirname "${BASH_SOURCE-$0}")"
+gravitino_dir="$(cd "${bin_dir}/../">/dev/null; pwd)"
-COPY packages/gravitino /root/gravitino
+cd ${gravitino_dir}
-EXPOSE 8090
-EXPOSE 9001
+python bin/rewrite_gravitino_server_config.py
-ENTRYPOINT ["/bin/bash", "/root/gravitino/bin/gravitino.sh", "start"]
+./bin/gravitino.sh start
\ No newline at end of file
diff --git a/docs/gravitino-server-config.md b/docs/gravitino-server-config.md
index 26be0f4ff9..ab519966ee 100644
--- a/docs/gravitino-server-config.md
+++ b/docs/gravitino-server-config.md
@@ -257,9 +257,90 @@ The Gravitino server automatically adds the catalog properties configuration dir
You could put HDFS configuration file to the catalog properties configuration dir, like `catalogs/lakehouse-iceberg/conf/`.
+## Docker instructions
+
+You could run Gravitino server through docker container:
+
+```shell
+docker run -d -p 8090:8090 apache/gravitino:latest
+```
+
+The Gravitino Docker image supports injecting configuration values via environment variables by translating them to corresponding entries in `gravitino.conf` at container startup.
+
+This is done using a startup script that parses environment variables prefixed with `GRAVITINO_` and rewrites the configuration file accordingly.
+
+These variables override the corresponding entries in `gravitino.conf` at startup.
+
+| Environment Variable | Configuration Key | Default Value | Since Version |
+|----------------------|-------------------|---------------|---------------|
+| `GRAVITINO_SERVER_SHUTDOWN_TIMEOUT` | `gravitino.server.shutdown.timeout` | `3000` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_HOST` | `gravitino.server.webserver.host` | `0.0.0.0` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_HTTP_PORT` | `gravitino.server.webserver.httpPort` | `8090` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_MIN_THREADS` | `gravitino.server.webserver.minThreads` | `24` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_MAX_THREADS` | `gravitino.server.webserver.maxThreads` | `200` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_STOP_TIMEOUT` | `gravitino.server.webserver.stopTimeout` | `30000` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_IDLE_TIMEOUT` | `gravitino.server.webserver.idleTimeout` | `30000` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_THREAD_POOL_WORK_QUEUE_SIZE` | `gravitino.server.webserver.threadPoolWorkQueueSize` | `100` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_REQUEST_HEADER_SIZE` | `gravitino.server.webserver.requestHeaderSize` | `131072` | 0.10.0-incubating |
+| `GRAVITINO_SERVER_WEBSERVER_RESPONSE_HEADER_SIZE` | `gravitino.server.webserver.responseHeaderSize` | `131072` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE` | `gravitino.entity.store` | `relational` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE_RELATIONAL` | `gravitino.entity.store.relational` | `JDBCBackend` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL` | `gravitino.entity.store.relational.jdbcUrl` | `jdbc:h2` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_DRIVER` | `gravitino.entity.store.relational.jdbcDriver` | `org.h2.Driver` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_USER` | `gravitino.entity.store.relational.jdbcUser` | `gravitino` | 0.10.0-incubating |
+| `GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_PASSWORD` | `gravitino.entity.store.relational.jdbcPassword` | `gravitino` | 0.10.0-incubating |
+| `GRAVITINO_CATALOG_CACHE_EVICTION_INTERVAL_MS` | `gravitino.catalog.cache.evictionIntervalMs` | `3600000` | 0.10.0-incubating |
+| `GRAVITINO_AUTHORIZATION_ENABLE` | `gravitino.authorization.enable` | `false` | 0.10.0-incubating |
+| `GRAVITINO_AUTHORIZATION_SERVICE_ADMINS` | `gravitino.authorization.serviceAdmins` | `anonymous` | 0.10.0-incubating |
+| `GRAVITINO_AUX_SERVICE_NAMES` | `gravitino.auxService.names` | `iceberg-rest` | 0.10.0-incubating |
+| `GRAVITINO_ICEBERG_REST_CLASSPATH` | `gravitino.iceberg-rest.classpath` | `iceberg-rest-server/libs, iceberg-rest-server/conf` | 0.10.0-incubating |
+| `GRAVITINO_ICEBERG_REST_HOST` | `gravitino.iceberg-rest.host` | `0.0.0.0` | 0.10.0-incubating |
+| `GRAVITINO_ICEBERG_REST_HTTP_PORT` | `gravitino.iceberg-rest.httpPort` | `9001` | 0.10.0-incubating |
+| `GRAVITINO_ICEBERG_REST_CATALOG_BACKEND` | `gravitino.iceberg-rest.catalog-backend` | `memory` | 0.10.0-incubating |
+| `GRAVITINO_ICEBERG_REST_WAREHOUSE` | `gravitino.iceberg-rest.warehouse` | `/tmp/` | 0.10.0-incubating |
+
+:::note
+This feature is supported in the Gravitino Docker image starting from version `0.10.0-incubating`.
+:::
+
+Usage Example:
+
+To start a container and override the default HTTP port:
+
+```shell
+docker run --rm -d \
+ -e GRAVITINO_SERVER_WEBSERVER_HTTP_PORT=8080 \
+ -p 8080:8080 \
+ apache/gravitino:<tag>
+```
+
+To configure JDBC backend with PostgreSQL:
+
+```shell
+docker run --rm -d \
+ -e GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_URL="jdbc:postgresql://localhost:5432/database1" \
+ -e GRAVITINO_ENTITY_STORE_RELATIONAL_JDBC_DRIVER="org.postgresql.Driver" \
+ -p 8090:8090 \
+ apache/gravitino:<tag>
+```
+
+You can verify that the configuration was applied correctly by inspecting the container's `gravitino.conf`:
+
+```shell
+docker exec -it <container_id> cat /root/gravitino/conf/gravitino.conf
+```
+
+:::note
+If both `gravitino.conf` and environment variable exist, the container’s startup script will overwrite the config file value with the environment variable.
+:::
+
+
## How to set up runtime environment variables
-The Gravitino server lets you set up runtime environment variables by editing the `gravitino-env.sh` file, located in the `conf` directory.
+The Gravitino server supports configuring runtime environment variables in two ways:
+
+1. **Local deployment:** Modify `gravitino-env.sh` located in the `conf` directory.
+2. **Docker container deployment:** Use environment variable injection during container startup. *(Since 0.10.0-incubating)*
### How to access Apache Hadoop