justinmclean commented on code in PR #5726:
URL: https://github.com/apache/gravitino/pull/5726#discussion_r1872219487
##########
docs/how-to-install.md:
##########
@@ -6,145 +6,170 @@ license: "This software is licensed under the Apache
License version 2."
## Install Apache Gravitino from scratch
+### Prerequisites
+
+If you have not installed Java, install it and configure the `JAVA_HOME`
environment variable.
+After installation, you can run the `${JAVA_HOME}/bin/java -version` command
to check the Java version.
+
:::note
-Apache Gravitino supports running on Java 8, 11, and 17. Make sure you have
Java installed and
-`JAVA_HOME` configured correctly. To confirm the Java version, run the
-`${JAVA_HOME}/bin/java -version` command.
+Gravitino supports running on Java 8, 11, and 17.
:::
-The Gravitino package comprises both the Gravitino server and the Gravitino
Iceberg REST server. You can manage these servers independently or run them
concurrently on a single server.
-
-### Get the Apache Gravitino binary distribution package
-
-Before installing Gravitino, make sure you have the Gravitino binary
distribution package. You can download the latest Gravitino binary distribution
package from [GitHub](https://github.com/apache/gravitino/releases).
-You can also build it yourself by following the instructions in [How to Build
Gravitino](./how-to-build.md).
-
- - If you build Gravitino yourself using the `./gradlew compileDistribution`
command, you can find the Gravitino binary distribution package in the
`distribution/package` directory.
-
- - If you build Gravitino yourself using the `./gradlew assembleDistribution`
command, you can get the compressed Gravitino binary distribution package with
the name `gravitino-<version>-bin.tar.gz` in the `distribution` directory with
sha256 checksum file `gravitino-<version>-bin.tar.gz.sha256`.
-
-The Gravitino binary distribution package contains the following files:
-
-```text
-|── ...
-└── distribution/package
- |── bin/
- | ├── gravitino.sh # Gravitino server Launching
scripts.
- | └── gravitino-iceberg-rest-server.sh # Gravitino Iceberg REST
server Launching scripts.
- |── catalogs
- | └── hadoop/ # Apache Hadoop catalog
dependencies and configurations.
- | └── hive/ # Apache Hive catalog
dependencies and configurations.
- | └── jdbc-doris/ # JDBC doris catalog
dependencies and configurations.
- | └── jdbc-mysql/ # JDBC MySQL catalog
dependencies and configurations.
- | └── jdbc-postgresql/ # JDBC PostgreSQL catalog
dependencies and configurations.
- | └── kafka/ # Apache Kafka PostgreSQL
catalog dependencies and configurations.
- | └── lakehouse-iceberg/ # Apache Iceberg catalog
dependencies and configurations.
- | └── lakehouse-paimon/ # Apache Paimon catalog
dependencies and configurations.
- |── conf/ # All configurations for
Gravitino.
- | ├── gravitino.conf # Gravitino server and
Gravitino Iceberg REST server configuration.
- | ├── gravitino-iceberg-rest-server.conf # Gravitino server
configuration.
- | ├── gravitino-env.sh # Environment variables, etc.,
JAVA_HOME, GRAVITINO_HOME, and more.
- | └── log4j2.properties # log4j configuration for the
Gravitino server and Gravitino Iceberg REST server.
- |── libs/ # Gravitino server
dependencies libraries.
- |── logs/ # Gravitino server and
Gravitino Iceberg REST server logs. Automatically created after the server
starts.
- |── data/ # Default directory for the
Gravitino server to store data.
- |── iceberg-rest-server/ # Gravitino Iceberg REST
server package and dependencies libraries.
- └── scripts/ # Extra scripts for Gravitino.
-```
+### Get the Gravitino package
+
+The Gravitino package comprises the Gravitino server and the Gravitino Iceberg
REST server.
+You can manage these servers independently or run them concurrently on a
single server.
+
+1. Download the latest Gravitino package from
[GitHub](https://github.com/apache/gravitino/releases).
+
+ :::note
+ You can also build Gravitino by following the instructions in [How to Build
Gravitino](./how-to-build.md).
+
+ - If you build Gravitino using the `./gradlew compileDistribution` command,
you can find the Gravitino package in the `distribution/package` directory.
-#### Initialize the RDBMS (Optional)
+ - If you build Gravitino using the `./gradlew assembleDistribution`
command, you can get the compressed Gravitino package with the name
`gravitino-<version>-bin.tar.gz` in the `distribution` directory with sha256
checksum file `gravitino-<version>-bin.tar.gz.sha256`.
+ :::
-If you want to use the relational backend storage, you need to initialize the
RDBMS first. For the details on initializing the RDBMS, please check [How to
use relational backend storage](./how-to-use-relational-backend-storage.md).
+2. Extract the Gravitino package and you should have these directories and
files:
-#### Configure the Apache Gravitino server
+ ```text
+ |── ...
+ └── distribution/package
+ |── bin/
+ | ├── gravitino.sh # Gravitino server Launching
scripts.
+ | └── gravitino-iceberg-rest-server.sh # Gravitino Iceberg REST
server Launching scripts.
+ |── catalogs
+ | └── hadoop/ # Apache Hadoop catalog
dependencies and configurations.
+ | └── hive/ # Apache Hive catalog
dependencies and configurations.
+ | └── jdbc-doris/ # JDBC doris catalog
dependencies and configurations.
+ | └── jdbc-mysql/ # JDBC MySQL catalog
dependencies and configurations.
+ | └── jdbc-postgresql/ # JDBC PostgreSQL catalog
dependencies and configurations.
+ | └── kafka/ # Apache Kafka PostgreSQL
catalog dependencies and configurations.
+ | └── lakehouse-iceberg/ # Apache Iceberg catalog
dependencies and configurations.
+ | └── lakehouse-paimon/ # Apache Paimon catalog
dependencies and configurations.
+ |── conf/ # All configurations for
Gravitino.
+ | ├── gravitino.conf # Gravitino server and
Gravitino Iceberg REST server configuration.
+ | ├── gravitino-iceberg-rest-server.conf # Gravitino server
configuration.
+ | ├── gravitino-env.sh # Environment variables,
etc., JAVA_HOME, GRAVITINO_HOME, and more.
+ | └── log4j2.properties # log4j configuration for
the Gravitino server and Gravitino Iceberg REST server.
+ |── libs/ # Gravitino server
dependencies libraries.
+ |── logs/ # Gravitino server and
Gravitino Iceberg REST server logs. Automatically created after the server
starts.
+ |── data/ # Default directory for the
Gravitino server to store data.
+ |── iceberg-rest-server/ # Gravitino Iceberg REST
server package and dependencies libraries.
+ └── scripts/ # Extra scripts for
Gravitino.
+ ```
-The Gravitino server configuration file is `conf/gravitino.conf`. You can
configure the Gravitino server by modifying this file. Basic configurations
have already been added to this file. All the configurations are listed in
[Gravitino Server Configurations](./gravitino-server-config.md).
+### Initialize the RDBMS (Optional)
-#### Configure the Apache Gravitino server log
+If you want to use the `relational` backend storage, you need to initialize
the RDBMS.
+For details on initializing the RDBMS, see [How to use relational backend
storage](./how-to-use-relational-backend-storage.md).
-The Gravitino server log configuration file is `conf/log4j2.properties`.
Gravitino uses Log4j2 as the Logging system. You can
[Log4j2](https://logging.apache.org/log4j/2.x/) to do the log configuration.
+### Configure the Gravitino server
-#### Configure the Apache Gravitino server environment
+The `conf/gravitino.conf` file provides basic configurations for the Gravitino
server.
+To configure the Gravitino server, you can update this file.
+For a full list of the Gravitino server configurations, see [Gravitino Server
Configurations](./gravitino-server-config.md).
-The Gravitino server environment configuration file is
`conf/gravitino-env.sh`. Gravitino exposes several environment variables. You
can modify them in this file.
+### Configure the Gravitino server log
-#### Configure Apache Gravitino catalogs
+Gravitino uses [Log4j2](https://logging.apache.org/log4j/2.x/) as the logging
system.
+The `conf/log4j2.properties` file provides the Gravitino server log
configurations.
+To configure the Gravitino server log, you can update this file.
-Gravitino supports multiple catalogs. You can configure the catalog-level
configurations by modifying the related configuration file in the
`catalogs/<catalog-provider>/conf` directory. The configurations you set here
apply to all the catalogs of the same type you create.
+### Configure the Gravitino server environment variables
-For example, if you want to configure the Hive catalog, you can modify the
file `catalogs/hive/conf/hive.conf`. The detailed configurations are listed in
the specific catalog documentation.
+Gravitino exposes several environment variables through the
`conf/gravitino-env.sh` file.
+To configure these environment variables, you can update this file.
+
+### Configure the Gravitino catalogs
+
+Gravitino supports multiple catalogs.
+You can configure the catalog-level configurations by updating the related
configuration file in the `catalogs/<catalog-provider>/conf` directory.
+For example, the `catalogs/hive/conf/hive.conf` file provides configurations
for the Hive catalog.
+The configurations you set in the catalog configuration file will apply to all
the catalogs of the same type that you created.
+For detailed configurations about each catalog, see related catalog
documentation.
:::note
-Gravitino takes the catalog configurations in the following order:
+Gravitino takes the catalog configurations in the following order of
precedence:
-1. Catalog `properties` specified in catalog creation API or REST API.
+1. Catalog `properties` specified in the catalog creation API or REST API.
2. Catalog configurations specified in the catalog configuration file.
-The catalog `properties` can override the catalog configurations specified in
the configurationfile.
+The catalog `properties` overrides the catalog configurations specified in the
configuration file.
:::
-Gravitino supports passing in catalog-specific configurations if you add
`gravitino.bypass.`. For example, if you want to pass in the HMS-specific
configuration `hive.metastore.client.capability.check` to the underlying Hive
client in the Hive catalog, add the `gravitino.bypass.` prefix.
+Gravitino supports passing in catalog-specific configurations by adding the
`gravitino.bypass.` prefix to the specific catalog configurations.
+For example, you can pass in the HMS-specific configuration
`hive.metastore.client.capability.check` to the underlying Hive client in the
Hive catalog if you add the `gravitino.bypass.` prefix to the configuration.
-Also, Gravitino supports loading catalog-specific configurations from external
files. For example,you can put your own `hive-site.xml` file in the
`catalogs/hive/conf` directory, and Gravitino loads it automatically.
+Gravitino also supports loading catalog-specific configurations from external
files.
+You just need to put your `hive-site.xml` file in the `catalogs/hive/conf`
directory, and Gravitino will load it automatically.
-#### Start Apache Gravitino server
+### Start the Gravitino server
-After configuring the Gravitino server, start the Gravitino server by running:
+- Start the Gravitino server
-```shell
-./bin/gravitino.sh start
-```
+ After configuring the Gravitino server, run the following command to start
the Gravitino server:
-Alternatively, to start the Gravitino server Web UI, please run:
+ ```shell
+ ./bin/gravitino.sh start
+ ```
-```shell
-./bin/gravitino.sh run
-```
+- Start the Gravitino server with the Web UI
-You can access the Gravitino Web UI by typing
[http://localhost:8090](http://localhost:8090) in your browser, or you can run:
+ By default, the Gravitino server also provides a Web UI on port `8090`.
+ Run the following command to start the Gravitino server with the Web UI:
+
+ ```shell
+ ./bin/gravitino.sh run
+ ```
+
+After starting the Gravitino server with the Web UI, visit
`http://localhost:8090` in your browser to access the Gravitino server through
the Web UI.
+Or, you can run the following command to verify that the Gravitino server is
running:
```shell
curl -v -X GET -H "Accept: application/vnd.gravitino.v1+json" -H
"Content-Type: application/json" http://localhost:8090/api/version
```
-to make sure Gravitino is running.
-
:::info
-If you need to debug the Gravitino server, enable the `GRAVITINO_DEBUG_OPTS`
environment variable in the `conf/gravitino-env.sh` file. Then create a `Remote
JVM Debug` configuration in `IntelliJ IDEA` and debug `gravitino.server.main`.
+If you need to debug the Gravitino server, configure the
`GRAVITINO_DEBUG_OPTS` environment variable in the `conf/gravitino-env.sh` file.
+Then create a `Remote JVM Debug` configuration in `IntelliJ IDEA` and debug
`gravitino.server.main`.
:::
-#### Manage Gravitino Iceberg REST server in Gravitino package
+### Start the Gravitino Iceberg REST server
-You can run the Iceberg REST server as either a standalone server or as an
auxiliary service embedded in the Gravitino server. To start it as a standalone
server, use the command `./bin/gravitino-iceberg-rest-server.sh start` with
configurations specified in `./conf/gravitino-iceberg-rest-server.conf`.
Alternatively, use `./bin/gravitino.sh start` to launch a Gravitino server that
integrates both the Iceberg REST service and the Gravitino service, with all
configurations centralized in `conf/gravitino.conf`.
+You can run the Iceberg REST server either as a standalone server or as an
auxiliary service embedded in the Gravitino server.
-For more detailed information about the Gravitino Iceberg REST server, please
refer to [Iceberg REST server document](./iceberg-rest-service.md).
+- To start the Iceberg REST server as a standalone server, run the
`./bin/gravitino-iceberg-rest-server.sh start` command with configurations
specified in the `./conf/gravitino-iceberg-rest-server.conf` file.
+- To start the Iceberg REST server as an auxiliary service embedded in the
Gravitino server, run the `./bin/gravitino.sh start` command with all
configurations in the `conf/gravitino.conf` file.
-## Install Apache Gravitino using Docker
+For details about the Gravitino Iceberg REST server, see the [Gravitino
Iceberg REST server documentation](./iceberg-rest-service.md).
-### Get the Apache Gravitino Docker image
+## Install Apache Gravitino using Docker
Gravitino publishes the Docker image to [Docker
Hub](https://hub.docker.com/r/apache/gravitino/tags).
-Run the Gravitino Docker image by running:
+The published Gravitino Docker image only contains the Gravitino server with
basic configurations.
-```shell
-docker run -d -i -p 8090:8090 apache/gravitino:<version>
-```
+### Prerequisites
-Access the Gravitino Web UI by typing `http://localhost:8090` in your browser,
or you
-can run
+If you have not installed Docker, download and install it by following the
[instructions](https://docs.docker.com/get-started/get-docker/) for your
Operating System (OS).
-```shell
-curl -v -X GET -H "Accept: application/vnd.gravitino.v1+json" -H
"Content-Type: application/json" http://localhost:8090/api/version
-```
+### Steps
+
+1. Start the Gravitino server.
+
+ ```shell
+ docker run -d -i -p 8090:8090 apache/gravitino:<version>
+ ```
-to make sure Gravitino is running.
+2. Visit `http://localhost:8090` in your browser to access the Gravitino
server through the Web UI.
+Or, you can run the following command to verify that THE Gravitino server is
running.
Review Comment:
Typo: "THE" should be lowercase "the" ("verify that the Gravitino server is running").
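A possible way to apply this, sketched as a GitHub suggestion block. It assumes the sentence sits on a single unindented line in `docs/how-to-install.md` (the quoted hunk is wrapped by the mail client) and that only the capitalization changes:

```suggestion
Or, you can run the following command to verify that the Gravitino server is running.
```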