This is an automated email from the ASF dual-hosted git repository.
mchades pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino-playground.git
The following commit(s) were added to refs/heads/main by this push:
new 75b6fae [#139] Improvement: externalize Gravitino metadata and
Jupyter notebooks outside of containers (#140)
75b6fae is described below
commit 75b6fae9ebddab439479dac43cd95d40a4f65d22
Author: Shaofeng Shi <[email protected]>
AuthorDate: Mon Apr 7 17:26:22 2025 +0800
[#139] Improvement: externalize Gravitino metadata and Jupyter notebooks
outside of containers (#140)
### What changes were proposed in this pull request?
Mount a local folder into the Gravitino and Jupyter containers to
persist Gravitino's metadata (an H2 database file by default) and
Jupyter's notebooks, so that they are not lost when the containers are
re-created.
### Why are the changes needed?
No changes to other parts.
If users want to clean up the metadata, they need to manually delete the
"data/gravitino" or "data/jupyter" folder under the playground folder.
Fix: #139
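
The manual cleanup described above could be sketched as a small shell helper. This is a minimal sketch and not part of this patch: the function name `clean_playground_data` and its optional directory argument are hypothetical, and it assumes the containers are stopped first so that nothing is writing to the mounted folders.

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): remove the persisted state
# that the playground keeps under data/.
# Assumption: the containers are stopped before this runs.
clean_playground_data() {
  # $1: path to the playground folder; defaults to the current directory.
  playground_dir="${1:-.}"
  # data/gravitino holds the Gravitino metadata (bind-mounted from
  # ./data/gravitino/db); data/jupyter holds the notebooks (bind-mounted
  # from ./data/jupyter/data).
  rm -rf -- "${playground_dir}/data/gravitino" "${playground_dir}/data/jupyter"
}
```

Calling `clean_playground_data /path/to/playground` removes both folders and leaves the rest of the checkout untouched. On the next start, the Jupyter init script in this patch should find no `.ipynb` files in `/home/jovyan` and copy the default notebooks again.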
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
I tested it locally. After I stopped and removed the containers and then
restarted them, the previous Gravitino metadata and Jupyter notebooks
were kept.
---
.gitignore | 1 +
docker-compose.yaml | 2 ++
init/gravitino/init.sh | 9 ++++++++-
init/jupyter/init.sh | 12 +++++++++---
4 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/.gitignore b/.gitignore
index 2c33a8d..cd652b1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,3 +2,4 @@
**/.DS_Store
**/packages
**/*.log
+data/
diff --git a/docker-compose.yaml b/docker-compose.yaml
index 9002819..83569f9 100644
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@@ -70,6 +70,7 @@ services:
volumes:
- ./healthcheck:/tmp/healthcheck
- ./init/gravitino:/tmp/gravitino
+ - ./data/gravitino/db:/root/gravitino/data
healthcheck:
test: ["CMD", "/tmp/healthcheck/gravitino-healthcheck.sh"]
interval: 5s
@@ -168,6 +169,7 @@ services:
- "18888:8888"
volumes:
- ./init/jupyter:/tmp/gravitino
+ - ./data/jupyter/data:/home/jovyan
entrypoint: /bin/bash /tmp/gravitino/init.sh
depends_on:
hive:
diff --git a/init/gravitino/init.sh b/init/gravitino/init.sh
index 2d5a850..b95d845 100644
--- a/init/gravitino/init.sh
+++ b/init/gravitino/init.sh
@@ -24,7 +24,14 @@ cp /root/gravitino/catalogs/jdbc-mysql/libs/mysql-connector-java-8.0.27.jar /roo
cp /root/gravitino/catalogs/jdbc-postgresql/libs/postgresql-42.2.7.jar /root/gravitino/iceberg-rest-server/libs
cp /root/gravitino/catalogs/jdbc-mysql/libs/mysql-connector-java-8.0.27.jar /root/gravitino/iceberg-rest-server/libs
-cp /tmp/gravitino/gravitino.conf /root/gravitino/conf
+
+
+if test -e "/root/gravitino/conf/gravitino.conf"; then
+ echo "/root/gravitino/conf/gravitino.conf exists. Skip copying."
+else
+ cp /tmp/gravitino/gravitino.conf /root/gravitino/conf
+fi
+
echo "Finish downloading"
echo "Start the Gravitino Server"
diff --git a/init/jupyter/init.sh b/init/jupyter/init.sh
index d419a5e..ac17bec 100644
--- a/init/jupyter/init.sh
+++ b/init/jupyter/init.sh
@@ -17,10 +17,16 @@
# under the License.
#
-if [ -z "$RANGER_ENABLE" ]; then
- cp -r /tmp/gravitino/*.ipynb /home/jovyan
+if [ -n "$(find /home/jovyan -maxdepth 1 -name "*.ipynb" -print -quit)" ]; then
+ echo "Already have .ipynb files in the directory, skip copying"
else
- cp -r /tmp/gravitino/authorization/*.ipynb /home/jovyan
+ echo "No .ipynb files in the directory, copy the default .ipynb files"
+
+ if [ -z "$RANGER_ENABLE" ]; then
+ cp -r /tmp/gravitino/*.ipynb /home/jovyan
+ else
+ cp -r /tmp/gravitino/authorization/*.ipynb /home/jovyan
+ fi
fi
start-notebook.sh --NotebookApp.token=''