This is an automated email from the ASF dual-hosted git repository.
fanng pushed a commit to branch branch-0.7
in repository https://gitbox.apache.org/repos/asf/gravitino.git
The following commit(s) were added to refs/heads/branch-0.7 by this push:
new de27d58df [#4551] feat(iceberg): add S3 and GCS support for
IcebergRESTService docker image (#5377)
de27d58df is described below
commit de27d58dfd21aa7eb49ab30080890adc921ae097
Author: github-actions[bot]
<41898282+github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed Oct 30 22:56:01 2024 +0800
[#4551] feat(iceberg): add S3 and GCS support for IcebergRESTService
docker image (#5377)
### What changes were proposed in this pull request?
1. add AWS and GCP bundle jar to IcebergRESTServer docker image
2. use environment variable to change the config
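
As a rough illustration of item 2, the sketch below shows how environment variables can be translated into prefixed configuration keys. The variable names and the `gravitino.iceberg-rest.` prefix are taken from `rewrite_config.py` in this commit. The helper `overrides_from_env` and the example values are hypothetical and only serve the illustration.

```python
# Subset of the env_map defined in rewrite_config.py in this commit.
ENV_MAP = {
    "GRAVITINO_IO_IMPL": "io-impl",
    "GRAVITINO_WAREHOUSE": "warehouse",
    "GRAVITINO_S3_REGION": "s3-region",
}
CONFIG_PREFIX = "gravitino.iceberg-rest."


def overrides_from_env(environ):
    """Return the config entries implied by the variables set in `environ`.

    Hypothetical helper for illustration; it is not part of the commit.
    """
    return {
        CONFIG_PREFIX + config_key: environ[env_name]
        for env_name, config_key in ENV_MAP.items()
        if env_name in environ
    }


# Hypothetical values; only mapped variables that are set produce entries.
example_env = {
    "GRAVITINO_WAREHOUSE": "s3://example-bucket/warehouse",
    "GRAVITINO_S3_REGION": "us-west-2",
    "UNRELATED": "ignored",
}
overrides = overrides_from_env(example_env)
for key, value in overrides.items():
    print("{} = {}".format(key, value))
```

Variables that are not set leave the corresponding configuration items untouched, which matches the `if k in os.environ` check in the script.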
### Why are the changes needed?
Fix: #4551
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
Ran SQL queries against data stored in S3 and GCS.
Co-authored-by: FANNG <[email protected]>
---
bundles/gcp-bundle/build.gradle.kts | 1 +
dev/docker/iceberg-rest-server/Dockerfile | 2 +-
.../iceberg-rest-server-dependency.sh | 28 ++++++++
dev/docker/iceberg-rest-server/rewrite_config.py | 78 ++++++++++++++++++++++
.../{Dockerfile => start-iceberg-rest-server.sh} | 14 ++--
docs/docker-image-details.md | 8 ++-
docs/iceberg-rest-service.md | 21 +++++-
7 files changed, 140 insertions(+), 12 deletions(-)
diff --git a/bundles/gcp-bundle/build.gradle.kts
b/bundles/gcp-bundle/build.gradle.kts
index b887ef2c5..4ff29b845 100644
--- a/bundles/gcp-bundle/build.gradle.kts
+++ b/bundles/gcp-bundle/build.gradle.kts
@@ -49,6 +49,7 @@ tasks.withType(ShadowJar::class.java) {
relocate("org.apache.httpcomponents",
"org.apache.gravitino.shaded.org.apache.httpcomponents")
relocate("org.apache.commons",
"org.apache.gravitino.shaded.org.apache.commons")
relocate("com.google", "org.apache.gravitino.shaded.com.google")
+ relocate("com.fasterxml", "org.apache.gravitino.shaded.com.fasterxml")
}
tasks.jar {
diff --git a/dev/docker/iceberg-rest-server/Dockerfile
b/dev/docker/iceberg-rest-server/Dockerfile
index d4a85915f..eae94c4a4 100644
--- a/dev/docker/iceberg-rest-server/Dockerfile
+++ b/dev/docker/iceberg-rest-server/Dockerfile
@@ -26,4 +26,4 @@ COPY packages/gravitino-iceberg-rest-server /root/gravitino-iceberg-rest-server
EXPOSE 9001
-ENTRYPOINT ["/bin/bash", "/root/gravitino-iceberg-rest-server/bin/gravitino-iceberg-rest-server.sh", "start"]
+ENTRYPOINT ["/bin/bash", "/root/gravitino-iceberg-rest-server/bin/start-iceberg-rest-server.sh"]
diff --git a/dev/docker/iceberg-rest-server/iceberg-rest-server-dependency.sh
b/dev/docker/iceberg-rest-server/iceberg-rest-server-dependency.sh
index 8581cc5be..5d0015786 100755
--- a/dev/docker/iceberg-rest-server/iceberg-rest-server-dependency.sh
+++ b/dev/docker/iceberg-rest-server/iceberg-rest-server-dependency.sh
@@ -34,6 +34,34 @@ cd distribution
tar xfz gravitino-iceberg-rest-server-*.tar.gz
cp -r gravitino-iceberg-rest-server*-bin ${iceberg_rest_server_dir}/packages/gravitino-iceberg-rest-server
+cd ${gravitino_home}
+./gradlew :bundles:gcp-bundle:jar
+./gradlew :bundles:aws-bundle:jar
+
+# prepare bundle jar
+cd ${iceberg_rest_server_dir}
+mkdir -p bundles
+cp ${gravitino_home}/bundles/gcp-bundle/build/libs/gravitino-gcp-bundle-*.jar bundles/
+cp ${gravitino_home}/bundles/aws-bundle/build/libs/gravitino-aws-bundle-*.jar bundles/
+
+iceberg_gcp_bundle="iceberg-gcp-bundle-1.5.2.jar"
+if [ ! -f "bundles/${iceberg_gcp_bundle}" ]; then
+  curl -L -s -o bundles/${iceberg_gcp_bundle} https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-gcp-bundle/1.5.2/${iceberg_gcp_bundle}
+fi
+
+iceberg_aws_bundle="iceberg-aws-bundle-1.5.2.jar"
+if [ ! -f "bundles/${iceberg_aws_bundle}" ]; then
+  curl -L -s -o bundles/${iceberg_aws_bundle} https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-aws-bundle/1.5.2/${iceberg_aws_bundle}
+fi
+
+# download jdbc driver
+curl -L -s -o bundles/sqlite-jdbc-3.42.0.0.jar https://repo1.maven.org/maven2/org/xerial/sqlite-jdbc/3.42.0.0/sqlite-jdbc-3.42.0.0.jar
+
+cp bundles/*jar ${iceberg_rest_server_dir}/packages/gravitino-iceberg-rest-server/libs/
+
+cp start-iceberg-rest-server.sh ${iceberg_rest_server_dir}/packages/gravitino-iceberg-rest-server/bin/
+cp rewrite_config.py ${iceberg_rest_server_dir}/packages/gravitino-iceberg-rest-server/bin/
+
# Keeping the container running at all times
cat <<EOF >> "${iceberg_rest_server_dir}/packages/gravitino-iceberg-rest-server/bin/gravitino-iceberg-rest-server.sh"
diff --git a/dev/docker/iceberg-rest-server/rewrite_config.py
b/dev/docker/iceberg-rest-server/rewrite_config.py
new file mode 100755
index 000000000..dce5479cf
--- /dev/null
+++ b/dev/docker/iceberg-rest-server/rewrite_config.py
@@ -0,0 +1,78 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+
+env_map = {
+    "GRAVITINO_IO_IMPL" : "io-impl",
+    "GRAVITINO_URI" : "uri",
+    "GRAVITINO_WAREHOUSE" : "warehouse",
+    "GRAVITINO_CREDENTIAL_PROVIDER_TYPE" : "credential-provider-type",
+    "GRAVITINO_GCS_CREDENTIAL_FILE_PATH" : "gcs-credential-file-path",
+    "GRAVITINO_S3_ACCESS_KEY" : "s3-access-key-id",
+    "GRAVITINO_S3_SECRET_KEY" : "s3-secret-access-key",
+    "GRAVITINO_S3_REGION" : "s3-region",
+    "GRAVITINO_S3_ROLE_ARN" : "s3-role-arn",
+    "GRAVITINO_S3_EXTERNAL_ID" : "s3-external-id"
+}
+
+init_config = {
+    "catalog-backend" : "jdbc",
+    "jdbc-driver" : "org.sqlite.JDBC",
+    "uri" : "jdbc:sqlite::memory:",
+    "jdbc-user" : "iceberg",
+    "jdbc-password" : "iceberg",
+    "jdbc-initialize" : "true",
+    "jdbc.schema-version" : "V1"
+}
+
+
+def parse_config_file(file_path):
+    config_map = {}
+    with open(file_path, 'r') as file:
+        for line in file:
+            stripped_line = line.strip()
+            if stripped_line and not stripped_line.startswith('#'):
+                # Split on the first '=' only, so values that contain '=' stay intact.
+                key, value = stripped_line.split('=', 1)
+                key = key.strip()
+                value = value.strip()
+                config_map[key] = value
+    return config_map
+
+config_prefix = "gravitino.iceberg-rest."
+
+def update_config(config, key, value):
+    config[config_prefix + key] = value
+
+config_file_path = 'conf/gravitino-iceberg-rest-server.conf'
+config_map = parse_config_file(config_file_path)
+
+for k, v in init_config.items():
+    update_config(config_map, k, v)
+
+for k, v in env_map.items():
+    if k in os.environ:
+        update_config(config_map, v, os.environ[k])
+
+if os.path.exists(config_file_path):
+    os.remove(config_file_path)
+
+with open(config_file_path, 'w') as file:
+    for key, value in config_map.items():
+        line = "{} = {}\n".format(key, value)
+        file.write(line)
diff --git a/dev/docker/iceberg-rest-server/Dockerfile
b/dev/docker/iceberg-rest-server/start-iceberg-rest-server.sh
old mode 100644
new mode 100755
similarity index 71%
copy from dev/docker/iceberg-rest-server/Dockerfile
copy to dev/docker/iceberg-rest-server/start-iceberg-rest-server.sh
index d4a85915f..449ed5ebf
--- a/dev/docker/iceberg-rest-server/Dockerfile
+++ b/dev/docker/iceberg-rest-server/start-iceberg-rest-server.sh
@@ -1,3 +1,4 @@
+#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
@@ -16,14 +17,13 @@
# specific language governing permissions and limitations
# under the License.
#
-FROM openjdk:17-jdk-buster
-LABEL maintainer="[email protected]"
+set -ex
+bin_dir="$(dirname "${BASH_SOURCE-$0}")"
+iceberg_rest_server_dir="$(cd "${bin_dir}/../">/dev/null; pwd)"
-WORKDIR /root/gravitino-iceberg-rest-server
+cd ${iceberg_rest_server_dir}
-COPY packages/gravitino-iceberg-rest-server /root/gravitino-iceberg-rest-server
+python bin/rewrite_config.py
-EXPOSE 9001
-
-ENTRYPOINT ["/bin/bash", "/root/gravitino-iceberg-rest-server/bin/gravitino-iceberg-rest-server.sh", "start"]
+./bin/gravitino-iceberg-rest-server.sh start
diff --git a/docs/docker-image-details.md b/docs/docker-image-details.md
index cad304657..5344d656c 100644
--- a/docs/docker-image-details.md
+++ b/docs/docker-image-details.md
@@ -51,11 +51,17 @@ You can deploy the standalone Gravitino Iceberg REST server
with the Docker imag
Container startup commands
```shell
-docker run --rm -d -p 9001:9001 apache/gravitino-iceberg-rest:0.6.1-incubating
+docker run --rm -d -p 9001:9001 apache/gravitino-iceberg-rest:0.7.0-incubating
```
Changelog
+- apache/gravitino-iceberg-rest:0.7.0-incubating
+  - Uses the JDBC catalog backend.
+  - Supports S3 and GCS storage.
+  - Supports credential vending.
+  - Supports changing configuration through environment variables.
+
- apache/gravitino-iceberg-rest:0.6.1-incubating
- Based on Gravitino 0.6.1-incubating, you can know more information from
0.6.1-incubating release notes.
diff --git a/docs/iceberg-rest-service.md b/docs/iceberg-rest-service.md
index 0fd670c42..1f92240bc 100644
--- a/docs/iceberg-rest-service.md
+++ b/docs/iceberg-rest-service.md
@@ -399,13 +399,28 @@ SELECT * FROM dml.test;
You could run Gravitino Iceberg REST server through docker container:
```shell
-docker run -d -p 9001:9001 apache/gravitino-iceberg-rest:0.6.0
+docker run -d -p 9001:9001 apache/gravitino-iceberg-rest:0.7.0-incubating
```
-Or build it manually to add custom logics:
+By default, the Gravitino Iceberg REST server in the Docker image accesses local storage. To use cloud or remote storage such as S3, set the following environment variables; refer to the [storage section](#storage) for more details.
+
+| Environment variables                | Configuration items                               | Since version     |
+|--------------------------------------|---------------------------------------------------|-------------------|
+| `GRAVITINO_IO_IMPL`                  | `gravitino.iceberg-rest.io-impl`                  | 0.7.0-incubating  |
+| `GRAVITINO_URI`                      | `gravitino.iceberg-rest.uri`                      | 0.7.0-incubating  |
+| `GRAVITINO_WAREHOUSE`                | `gravitino.iceberg-rest.warehouse`                | 0.7.0-incubating  |
+| `GRAVITINO_CREDENTIAL_PROVIDER_TYPE` | `gravitino.iceberg-rest.credential-provider-type` | 0.7.0-incubating  |
+| `GRAVITINO_GCS_CREDENTIAL_FILE_PATH` | `gravitino.iceberg-rest.gcs-credential-file-path` | 0.7.0-incubating  |
+| `GRAVITINO_S3_ACCESS_KEY`            | `gravitino.iceberg-rest.s3-access-key-id`         | 0.7.0-incubating  |
+| `GRAVITINO_S3_SECRET_KEY`            | `gravitino.iceberg-rest.s3-secret-access-key`     | 0.7.0-incubating  |
+| `GRAVITINO_S3_REGION`                | `gravitino.iceberg-rest.s3-region`                | 0.7.0-incubating  |
+| `GRAVITINO_S3_ROLE_ARN`              | `gravitino.iceberg-rest.s3-role-arn`              | 0.7.0-incubating  |
+| `GRAVITINO_S3_EXTERNAL_ID`           | `gravitino.iceberg-rest.s3-external-id`           | 0.7.0-incubating  |
+
+Or build it manually to add custom configuration or logic:
```shell
-sh ./dev/docker/build-docker.sh --platform linux/arm64 --type iceberg-rest-server --image apache/gravitino-iceberg-rest --tag 0.6.0
+sh ./dev/docker/build-docker.sh --platform linux/arm64 --type iceberg-rest-server --image apache/gravitino-iceberg-rest --tag 0.7.0-incubating
```
You could try Spark with Gravitino REST catalog service in our
[playground](./how-to-use-the-playground.md#using-apache-iceberg-rest-service).
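
The environment variables listed in the table above could be passed to the container with `-e` flags on `docker run`. The following is a minimal sketch that assembles such a command for an S3 warehouse. The image tag and variable names come from this commit. The helper `docker_run_command`, the bucket, the region, and the credential values are hypothetical placeholders, and the exact set of variables a deployment needs is an assumption.

```python
import shlex

IMAGE = "apache/gravitino-iceberg-rest:0.7.0-incubating"


def docker_run_command(env):
    """Build the `docker run` argv, adding one `-e NAME=value` pair per entry."""
    argv = ["docker", "run", "-d", "-p", "9001:9001"]
    for name, value in env.items():
        argv += ["-e", "{}={}".format(name, value)]
    argv.append(IMAGE)
    return argv


# Placeholder values; replace with your own bucket, region and credentials.
s3_env = {
    "GRAVITINO_IO_IMPL": "org.apache.iceberg.aws.s3.S3FileIO",
    "GRAVITINO_WAREHOUSE": "s3://example-bucket/warehouse",
    "GRAVITINO_S3_REGION": "us-west-2",
    "GRAVITINO_S3_ACCESS_KEY": "EXAMPLE_ACCESS_KEY",
    "GRAVITINO_S3_SECRET_KEY": "EXAMPLE_SECRET_KEY",
}
command = docker_run_command(s3_env)
# shlex.join (Python 3.8+) quotes arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

Building the command as a list and quoting it at the end avoids shell-escaping mistakes when a value contains spaces or special characters.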