linghengqian commented on code in PR #5629: URL: https://github.com/apache/hive/pull/5629#discussion_r1947885715
##########
packaging/src/docker/README.md:
##########
@@ -210,3 +210,61 @@ docker compose down
 select count(distinct a) from hive_example;
 select sum(b) from hive_example;
 ```
+
+#### `sys` Schema and `information_schema` Schema
+
+`Hive Schema Tool` is located in the Docker Image at `/opt/hive/bin/schematool`.
+
+By default, system schemas such as `information_schema` for HiveServer2 are not created.
+To create system schemas for a HiveServer2 instance,
+users need to configure HiveServer2 to use a remote Hive Metastore Server and use a database other than embedded Derby for the Hive Metastore Server.
+
+Assuming `Maven` and `Docker CE` are installed, a possible use case is as follows.
+Create a `compose.yaml` file in the current directory,
+
+```yaml
+services:
+  some-postgres:
+    image: postgres:17.2-bookworm
+    environment:
+      POSTGRES_PASSWORD: "example"
+  metastore-standalone:
+    image: apache/hive:4.0.1
+    depends_on:
+      - some-postgres
+    environment:
+      SERVICE_NAME: metastore
+      DB_DRIVER: postgres
+      SERVICE_OPTS: >-
+        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
+        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://some-postgres:5432/postgres
+        -Djavax.jdo.option.ConnectionUserName=postgres
+        -Djavax.jdo.option.ConnectionPassword=example
+    volumes:
+      - ~/.m2/repository/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar:/opt/hive/lib/postgres.jar
+  hiveserver2-standalone:
+    image: apache/hive:4.0.1
+    depends_on:
+      - metastore-standalone
+    environment:
+      SERVICE_NAME: hiveserver2
+      IS_RESUME: true
+      SERVICE_OPTS: >-
+        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
+        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://some-postgres:5432/postgres
+        -Djavax.jdo.option.ConnectionUserName=postgres
+        -Djavax.jdo.option.ConnectionPassword=example
+        -Dhive.metastore.uris=thrift://metastore-standalone:9083
+    volumes:
+      - ~/.m2/repository/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar:/opt/hive/lib/postgres.jar
+```
+
+Then execute the shell command as follows to initialize the system schemas in HiveServer2.
+
+```shell
+mvn dependency:get -Dartifact=org.postgresql:postgresql:42.7.5
+docker compose up -d
+docker compose exec hiveserver2-standalone /bin/bash
+/opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url jdbc:hive2://localhost:10000/default
+exit
+```

Review Comment:
@dengzhhu653 - One question remains open in this PR: should I require users to install `Maven` in advance through `SDKMAN!`? The unit test I wrote in https://github.com/linghengqian/hive-server2-jdbc-driver/pull/23 uses a `Dockerfile` to build the Docker image dynamically, whereas the Hive documentation tends to assume that users already know how to use `Maven`.

```dockerfile
FROM alpine:3.21.2 AS prepare
RUN apk add --no-cache wget
RUN wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar --directory-prefix=/opt/hive/lib
RUN wget https://repo1.maven.org/maven2/org/checkerframework/checker-qual/3.48.3/checker-qual-3.48.3.jar --directory-prefix=/opt/hive/lib

FROM apache/hive:4.0.1
COPY --from=prepare /opt/hive/lib /opt/hive/lib
```
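
A middle ground between requiring `Maven` and building a custom image could be to fetch the driver jar straight from Maven Central. The sketch below is only an illustration under assumptions: the helper name `maven_central_url` and the `./lib` directory are hypothetical and not part of the PR, and the URL layout simply mirrors the `repo1.maven.org` links used in the `Dockerfile` above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): build the Maven Central URL of a jar
# from its groupId, artifactId and version.
maven_central_url() {
  # Maven Central stores the groupId as a directory path: org.postgresql -> org/postgresql
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$2" "$3" "$2" "$3"
}

# Print the URL of the PostgreSQL JDBC driver version referenced in compose.yaml.
maven_central_url org.postgresql postgresql 42.7.5

# Possible download step without Maven (assumes curl is installed; ./lib is an arbitrary choice):
#   mkdir -p ./lib
#   curl -fsSL -o ./lib/postgresql-42.7.5.jar \
#     "$(maven_central_url org.postgresql postgresql 42.7.5)"
```

With this approach, the `volumes` entries in `compose.yaml` would have to mount `./lib/postgresql-42.7.5.jar` instead of the `~/.m2/repository/...` path, so the README would no longer depend on a local Maven repository.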