[ 
https://issues.apache.org/jira/browse/HIVE-27382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18078415#comment-18078415
 ] 

Stamatis Zampetakis commented on HIVE-27382:
--------------------------------------------

The main thing that kept me from contributing the 
postgres-tpcds-metastore image to the Hive repository was the Dockerfile's 
dependency on a large binary database dump (~145MB compressed) that was copied 
(via COPY) from a local directory. Following some recent discussions on the dev 
list, I revisited the initial design to better separate sources (Dockerfile) 
from artifacts (DB dumps). 

The documentation for the artifacts (DB dumps) has been moved to a new repo, 
https://github.com/zabetak/hive-test-datasets, and the dump files themselves 
are attached as release assets. We could place the dumps in S3, Google Cloud, 
or another storage service, but there is no strong reason to do so. Moreover, 
this artifact publication scheme could easily be transferred to Apache, where 
we could have https://github.com/apache/hive-test-datasets without much setup 
or special INFRA requirements.
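
As a rough illustration of how a build step could fetch a dump from the release 
assets instead of a local directory, here is a minimal sketch. It relies only 
on the standard GitHub release-asset URL layout and on curl; the function 
names, the tag, and the asset file name are hypothetical and do not reflect the 
actual names used in the repositories.

```shell
#!/bin/sh
# Sketch only: function names, tag, and asset name are illustrative assumptions.

# Build the download URL of a GitHub release asset.
# GitHub serves release assets under:
#   https://github.com/<owner>/<repo>/releases/download/<tag>/<asset>
release_asset_url() {
  repo=$1 tag=$2 asset=$3
  printf 'https://github.com/%s/releases/download/%s/%s\n' "$repo" "$tag" "$asset"
}

# Download an asset into a target directory.
# -f makes curl fail on HTTP errors; -L follows the redirect GitHub issues.
fetch_dump() {
  url=$(release_asset_url "$1" "$2" "$3")
  curl -fL --output "$4/$3" "$url"
}
```

A call would look like `fetch_dump zabetak/hive-test-datasets <tag> <asset> .`, 
with the real tag and asset name taken from the releases page of the repo.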

The Dockerfile that resides in 
https://github.com/zabetak/hive-postgres-metastore is now minimal and can be 
moved to the main Hive repo with very little effort.

> RDBMS docker images: contribute Dockerfile to hive repo - 
> postgres-tpcds-metastore
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-27382
>                 URL: https://issues.apache.org/jira/browse/HIVE-27382
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Testing Infrastructure
>            Reporter: László Bodor
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> This is technical debt that needs to be resolved. We use images like:
> {code}
> grep -iRH "getDockerImageName" -A 1
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/PostgresTPCDS.java:
>   public String getDockerImageName() {
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/PostgresTPCDS.java-
>     return "zabetak/postgres-tpcds-metastore:1.3";
> --
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java:
>   public String getDockerImageName() {
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java-
>     return "abstractdog/oracle-xe:18.4.0-slim";
> {code}
> Oops, abstractdog is mine. Anyway, the "-slim" image, for instance, is used 
> for a reason, and if there is no official slim image, it's fine to use this 
> one, but it needs to be contributed to Hive, including:
> 1. Dockerfile
> 2. build instructions
> 3. image published to a Hive-related Docker registry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
