[ https://issues.apache.org/jira/browse/HIVE-27382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18078415#comment-18078415 ]
Stamatis Zampetakis commented on HIVE-27382:
--------------------------------------------
The main issue that kept me from contributing the
postgres-tpcds-metastore image to the Hive repository was the Dockerfile's
dependency on a large binary database dump (~145MB compressed) that was COPYed
from a local directory. Based on some recent discussions on the dev list, I
revisited the initial design to better separate sources (Dockerfile) from
artifacts (DB dumps).
The documentation for the artifacts (DB dumps) has been moved to a new repo,
i.e., https://github.com/zabetak/hive-test-datasets, and the dump files
themselves are attached as release assets. We could place the dumps in S3,
Google Cloud, or other storage, but there is no strong reason to do so.
Moreover, this artifact publication scheme could easily be transferred to
Apache, where we could have https://github.com/apache/hive-test-datasets
without much setup or special INFRA requirements.
The Dockerfile that resides in
https://github.com/zabetak/hive-postgres-metastore is now minimal and can be
moved to the main Hive repo with very little effort.
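To illustrate the separation described above, here is a minimal shell sketch of
how a build step might fetch a dump from a GitHub release instead of COPYing it
from a local directory. It is not taken from the actual Dockerfile: the function
names are hypothetical, and the release tag and asset file name are placeholders
that have to be looked up on the releases page of the datasets repo. The only
thing assumed about GitHub is its standard release-asset URL layout.

```shell
# Sketch only: function names are illustrative, not part of any Hive tooling.

# Print the download URL of a GitHub release asset.
# Arguments: owner, repo, release tag, asset file name.
release_asset_url() {
  printf 'https://github.com/%s/%s/releases/download/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Download a dump into a target directory, keeping the asset's file name.
# Arguments: asset URL, destination directory.
# -f makes curl fail on HTTP errors; -L follows the redirect that GitHub
# issues for release assets.
fetch_dump() {
  mkdir -p "$2"
  curl -fL --retry 3 -o "$2/${1##*/}" "$1"
}
```

A possible invocation, with TAG and ASSET standing in for the real values:
fetch_dump "$(release_asset_url zabetak hive-test-datasets "$TAG" "$ASSET")" ./dumps
If the datasets were moved under Apache, only the owner argument would change.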
> RDBMS docker images: contribute Dockerfile to hive repo -
> postgres-tpcds-metastore
> ----------------------------------------------------------------------------------
>
> Key: HIVE-27382
> URL: https://issues.apache.org/jira/browse/HIVE-27382
> Project: Hive
> Issue Type: Sub-task
> Components: Testing Infrastructure
> Reporter: László Bodor
> Assignee: Stamatis Zampetakis
> Priority: Major
>
> This is technical debt to be resolved; we use images like:
> {code}
> grep -iRH "getDockerImageName" -A 1
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/PostgresTPCDS.java:
> public String getDockerImageName() {
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/PostgresTPCDS.java-
> return "zabetak/postgres-tpcds-metastore:1.3";
> --
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java:
> public String getDockerImageName() {
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java-
> return "abstractdog/oracle-xe:18.4.0-slim";
> {code}
> Oops, abstractdog is mine. Anyway, the "-slim" image, for instance, is used
> for a reason, and if there is no official slim image, it's fine to use this
> one, but it needs to be contributed to Hive, including:
> 1. Dockerfile
> 2. build instructions
> 3. image published to a Hive-related Docker registry