This is an automated email from the ASF dual-hosted git repository.
jshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git
The following commit(s) were added to refs/heads/main by this push:
new fa9ccd29a [#5114] Improvement:use incubating version in docs (#5207)
fa9ccd29a is described below
commit fa9ccd29a908a51d7b45716b3313e7f0b9ff7b5a
Author: lsyulong <[email protected]>
AuthorDate: Thu Oct 24 14:59:36 2024 +0800
[#5114] Improvement:use incubating version in docs (#5207)
### What changes were proposed in this pull request?
[#5114] Improvement:use incubating version in docs
### Why are the changes needed?
Fix: #5114
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
No need
---------
Co-authored-by: yuqi <[email protected]>
---
docs/apache-hive-catalog.md | 28 ++++-----
docs/flink-connector/flink-catalog-hive.md | 16 ++---
docs/flink-connector/flink-connector.md | 64 ++++++++++----------
docs/hadoop-catalog.md | 14 ++---
docs/how-to-use-gvfs.md | 41 +++++++------
docs/iceberg-rest-service.md | 81 +++++++++++++-------------
docs/security/authorization-pushdown.md | 16 ++---
docs/security/how-to-authenticate.md | 28 ++++-----
docs/spark-connector/spark-catalog-iceberg.md | 24 ++++----
docs/spark-connector/spark-integration-test.md | 20 +++----
10 files changed, 165 insertions(+), 167 deletions(-)
diff --git a/docs/apache-hive-catalog.md b/docs/apache-hive-catalog.md
index 53659355d..732183b3d 100644
--- a/docs/apache-hive-catalog.md
+++ b/docs/apache-hive-catalog.md
@@ -133,25 +133,25 @@ Since 0.6.0-incubating, the data types other than listed above are mapped to Gra
Table properties supply or set metadata for the underlying Hive tables.
The following table lists predefined table properties for a Hive table.
Additionally, you can define your own key-value pair properties and transmit them to the underlying Hive database.
-| Property Name | Description | Default Value | Required | Since version |
-|--------------------|-----------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------------|
-| `location` | The location for table storage, such as `/user/hive/warehouse/test_table`. | HMS uses the database location as the parent directory by default. | No | 0.2.0 |
-| `table-type` | Type of the table. Valid values include `MANAGED_TABLE` and `EXTERNAL_TABLE`. | `MANAGED_TABLE` | No | 0.2.0 |
-| `format` | The table file format. Valid values include `TEXTFILE`, `SEQUENCEFILE`, `RCFILE`, `ORC`, `PARQUET`, `AVRO`, `JSON`, `CSV`, and `REGEX`. | `TEXTFILE` | No | 0.2.0 |
-| `input-format` | The input format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcInputFormat`. | The property `format` sets the default value `org.apache.hadoop.mapred.TextInputFormat` and can change it to a different default. | No | 0.2.0 |
-| `output-format` | The output format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat`. | The property `format` sets the default value `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` and can change it to a different default. | No | 0.2.0 |
-| `serde-lib` | The serde library class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcSerde`. | The property `format` sets the default value `org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe` and can change it to a different default. | No | 0.2.0 |
+| Property Name | Description | Default Value | Required | Since version |
+|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------------|
+| `location` | The location for table storage, such as `/user/hive/warehouse/test_table`. | HMS uses the database location as the parent directory by default. | No | 0.2.0 |
+| `table-type` | Type of the table. Valid values include `MANAGED_TABLE` and `EXTERNAL_TABLE`. | `MANAGED_TABLE` | No | 0.2.0 |
+| `format` | The table file format. Valid values include `TEXTFILE`, `SEQUENCEFILE`, `RCFILE`, `ORC`, `PARQUET`, `AVRO`, `JSON`, `CSV`, and `REGEX`. | `TEXTFILE` | No | 0.2.0 |
+| `input-format` | The input format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcInputFormat`. | The property `format` sets the default value `org.apache.hadoop.mapred.TextInputFormat` and can change it to a different default. | No | 0.2.0 |
+| `output-format` | The output format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat`. | The property `format` sets the default value `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` and can change it to a different default. | No | 0.2.0 |
+| `serde-lib` | The serde library class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcSerde`. | The property `format` sets the default value `org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe` and can change it to a different default. | No | 0.2.0 |
| `serde.parameter.` | The prefix of the serde parameter, such as `"serde.parameter.orc.create.index" = "true"`, indicating `ORC` serde lib to create row indexes | (none) | No | 0.2.0 |
Hive automatically adds and manages some reserved properties. Users aren't allowed to set these properties.
-| Property Name | Description | Since Version |
-|-------------------------|---------------------------------------------------|---------------|
+| Property Name | Description | Since Version |
+|-------------------------|-------------------------------------------------|---------------|
| `comment` | Used to store a table comment. | 0.2.0 |
-| `numFiles` | Used to store the number of files in the table. | 0.2.0 |
-| `totalSize` | Used to store the total size of the table. | 0.2.0 |
-| `EXTERNAL` | Indicates whether the table is external. | 0.2.0 |
-| `transient_lastDdlTime` | Used to store the last DDL time of the table. | 0.2.0 |
+| `numFiles` | Used to store the number of files in the table. | 0.2.0 |
+| `totalSize` | Used to store the total size of the table. | 0.2.0 |
+| `EXTERNAL` | Indicates whether the table is external. | 0.2.0 |
+| `transient_lastDdlTime` | Used to store the last DDL time of the table. | 0.2.0 |
### Table indexes
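The two Hive property tables above are easier to check against a concrete payload. A hedged sketch (only the property names come from the tables; the values and the `conflicts` check are illustrative, not part of this commit):

```python
# Illustrative key/value map of the predefined Hive table properties listed
# in the first table above; values here are made up for demonstration.
user_props = {
    "location": "/user/hive/warehouse/test_table",
    "table-type": "EXTERNAL_TABLE",              # or MANAGED_TABLE (the default)
    "format": "ORC",                             # TEXTFILE, ORC, PARQUET, ...
    "serde.parameter.orc.create.index": "true",  # prefixed serde parameter
}

# Reserved properties from the second table are managed by Hive itself and
# users aren't allowed to set them.
RESERVED = {"comment", "numFiles", "totalSize", "EXTERNAL", "transient_lastDdlTime"}
conflicts = RESERVED.intersection(user_props)
```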
diff --git a/docs/flink-connector/flink-catalog-hive.md b/docs/flink-connector/flink-catalog-hive.md
index 136dac3ed..ae5581706 100644
--- a/docs/flink-connector/flink-catalog-hive.md
+++ b/docs/flink-connector/flink-catalog-hive.md
@@ -11,7 +11,7 @@ With the Apache Gravitino Flink connector, accessing data or managing metadata i
Supports most DDL and DML operations in Flink SQL, except such operations:
-- Function operations
+- Function operations
- Partition operations
- View operations
- Querying UDF
@@ -59,13 +59,13 @@ The configuration of Flink Hive Connector is the same with the original Flink Hi
Gravitino catalog property names with the prefix `flink.bypass.` are passed to Flink Hive connector. For example, using `flink.bypass.hive-conf-dir` to pass the `hive-conf-dir` to the Flink Hive connector.
The validated catalog properties are listed below. Any other properties with the prefix `flink.bypass.` in Gravitino Catalog will be ignored by Gravitino Flink Connector.
-| Property name in Gravitino catalog properties | Flink Hive connector configuration | Description | Since Version |
-|-----------------------------------------------|------------------------------------|-----------------------|---------------|
-| `flink.bypass.default-database` | `default-database` | Hive default database | 0.6.0 |
-| `flink.bypass.hive-conf-dir` | `hive-conf-dir` | Hive conf dir | 0.6.0 |
-| `flink.bypass.hive-version` | `hive-version` | Hive version | 0.6.0 |
-| `flink.bypass.hadoop-conf-dir` | `hadoop-conf-dir` | Hadoop conf dir | 0.6.0 |
-| `metastore.uris` | `hive.metastore.uris` | Hive metastore uri | 0.6.0 |
+| Property name in Gravitino catalog properties | Flink Hive connector configuration | Description | Since Version |
+|-----------------------------------------------|------------------------------------|-----------------------|------------------|
+| `flink.bypass.default-database` | `default-database` | Hive default database | 0.6.0-incubating |
+| `flink.bypass.hive-conf-dir` | `hive-conf-dir` | Hive conf dir | 0.6.0-incubating |
+| `flink.bypass.hive-version` | `hive-version` | Hive version | 0.6.0-incubating |
+| `flink.bypass.hadoop-conf-dir` | `hadoop-conf-dir` | Hadoop conf dir | 0.6.0-incubating |
+| `metastore.uris` | `hive.metastore.uris` | Hive metastore uri | 0.6.0-incubating |
:::caution
You can set other hadoop properties (with the prefix `hadoop.`, `dfs.`, `fs.`, `hive.`) in Gravitino Catalog properties. If so, it will override
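The `flink.bypass.` pass-through described above can be sketched in code. This is a hedged illustration of the documented behavior, not the connector's actual implementation: only the validated keys from the table are forwarded (renamed to their Flink-side names), and any other `flink.bypass.` key is ignored.

```python
# Mapping copied from the table above: Gravitino catalog property name ->
# Flink Hive connector configuration key.
VALIDATED = {
    "flink.bypass.default-database": "default-database",
    "flink.bypass.hive-conf-dir": "hive-conf-dir",
    "flink.bypass.hive-version": "hive-version",
    "flink.bypass.hadoop-conf-dir": "hadoop-conf-dir",
    "metastore.uris": "hive.metastore.uris",
}

def to_flink_hive_conf(catalog_props):
    """Keep only the validated properties, renamed for the Flink Hive connector."""
    return {VALIDATED[k]: v for k, v in catalog_props.items() if k in VALIDATED}
```

For example, `flink.bypass.hive-conf-dir` passes through as `hive-conf-dir`, while an unvalidated `flink.bypass.something-else` is dropped.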
diff --git a/docs/flink-connector/flink-connector.md b/docs/flink-connector/flink-connector.md
index 639dd0d68..8c74a6170 100644
--- a/docs/flink-connector/flink-connector.md
+++ b/docs/flink-connector/flink-connector.md
@@ -7,7 +7,7 @@ license: "This software is licensed under the Apache License version 2."
## Overview
-The Apache Gravitino Flink connector implements the [Catalog Store](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/catalogs/#catalog-store) to manage the catalogs under Gravitino.
+The Apache Gravitino Flink connector implements the [Catalog Store](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/catalogs/#catalog-store) to manage the catalogs under Gravitino.
This capability allows users to perform federation queries, accessing data from various catalogs through a unified interface and consistent access control.
## Capabilities
@@ -26,11 +26,11 @@ This capability allows users to perform federation queries, accessing data from
1. [Build](../how-to-build.md) or [download](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-flink-connector-runtime-1.18) the Gravitino flink connector runtime jar, and place it to the classpath of Flink.
2. Configure the Flink configuration to use the Gravitino flink connector.
-| Property | Type | Default Value | Description | Required | Since Version |
-|--------------------------------------------------|--------|-------------------|----------------------------------------------------------------------|----------|---------------|
-| table.catalog-store.kind | string | generic_in_memory | The Catalog Store name, it should set to `gravitino`. | Yes | 0.6.0 |
-| table.catalog-store.gravitino.gravitino.metalake | string | (none) | The metalake name that flink connector used to request to Gravitino. | Yes | 0.6.0 |
-| table.catalog-store.gravitino.gravitino.uri | string | (none) | The uri of Gravitino server address. | Yes | 0.6.0 |
+| Property | Type | Default Value | Description | Required | Since Version |
+|--------------------------------------------------|--------|-------------------|----------------------------------------------------------------------|----------|------------------|
+| table.catalog-store.kind | string | generic_in_memory | The Catalog Store name, it should set to `gravitino`. | Yes | 0.6.0-incubating |
+| table.catalog-store.gravitino.gravitino.metalake | string | (none) | The metalake name that flink connector used to request to Gravitino. | Yes | 0.6.0-incubating |
+| table.catalog-store.gravitino.gravitino.uri | string | (none) | The uri of Gravitino server address. | Yes | 0.6.0-incubating |
Set the flink configuration in flink-conf.yaml.
```yaml
@@ -48,7 +48,7 @@ EnvironmentSettings.Builder builder = EnvironmentSettings.newInstance().withConf
TableEnvironment tableEnv = TableEnvironment.create(builder.inBatchMode().build());
```
-3. Execute the Flink SQL query.
+3. Execute the Flink SQL query.
Suppose there is only one hive catalog with the name `hive` in the metalake `test`.
@@ -66,28 +66,28 @@ SELECT * FROM hive_students;
Gravitino flink connector support the following datatype mapping between Flink and Gravitino.
-| Flink Type | Gravitino Type | Since Version |
-|----------------------------------|-------------------------------|---------------|
-| `array` | `array` | 0.6.0 |
-| `bigint` | `long` | 0.6.0 |
-| `binary` | `fixed` | 0.6.0 |
-| `boolean` | `boolean` | 0.6.0 |
-| `char` | `char` | 0.6.0 |
-| `date` | `date` | 0.6.0 |
-| `decimal` | `decimal` | 0.6.0 |
-| `double` | `double` | 0.6.0 |
-| `float` | `float` | 0.6.0 |
-| `integer` | `integer` | 0.6.0 |
-| `map` | `map` | 0.6.0 |
-| `null` | `null` | 0.6.0 |
-| `row` | `struct` | 0.6.0 |
-| `smallint` | `short` | 0.6.0 |
-| `time` | `time` | 0.6.0 |
-| `timestamp` | `timestamp without time zone` | 0.6.0 |
-| `timestamp without time zone` | `timestamp without time zone` | 0.6.0 |
-| `timestamp with time zone` | `timestamp with time zone` | 0.6.0 |
-| `timestamp with local time zone` | `timestamp with time zone` | 0.6.0 |
-| `timestamp_ltz` | `timestamp with time zone` | 0.6.0 |
-| `tinyint` | `byte` | 0.6.0 |
-| `varbinary` | `binary` | 0.6.0 |
-| `varchar` | `string` | 0.6.0 |
+| Flink Type | Gravitino Type | Since Version |
+|----------------------------------|-------------------------------|------------------|
+| `array` | `list` | 0.6.0-incubating |
+| `bigint` | `long` | 0.6.0-incubating |
+| `binary` | `fixed` | 0.6.0-incubating |
+| `boolean` | `boolean` | 0.6.0-incubating |
+| `char` | `char` | 0.6.0-incubating |
+| `date` | `date` | 0.6.0-incubating |
+| `decimal` | `decimal` | 0.6.0-incubating |
+| `double` | `double` | 0.6.0-incubating |
+| `float` | `float` | 0.6.0-incubating |
+| `integer` | `integer` | 0.6.0-incubating |
+| `map` | `map` | 0.6.0-incubating |
+| `null` | `null` | 0.6.0-incubating |
+| `row` | `struct` | 0.6.0-incubating |
+| `smallint` | `short` | 0.6.0-incubating |
+| `time` | `time` | 0.6.0-incubating |
+| `timestamp` | `timestamp without time zone` | 0.6.0-incubating |
+| `timestamp without time zone` | `timestamp without time zone` | 0.6.0-incubating |
+| `timestamp with time zone` | `timestamp with time zone` | 0.6.0-incubating |
+| `timestamp with local time zone` | `timestamp with time zone` | 0.6.0-incubating |
+| `timestamp_ltz` | `timestamp with time zone` | 0.6.0-incubating |
+| `tinyint` | `byte` | 0.6.0-incubating |
+| `varbinary` | `binary` | 0.6.0-incubating |
+| `varchar` | `string` | 0.6.0-incubating |
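The three catalog-store settings documented in the flink-connector table above can be rendered as `flink-conf.yaml` lines. A hedged sketch; the metalake name and server URI below are illustrative values, not the commit's content:

```python
# The three required catalog-store properties from the table above.
flink_conf = {
    "table.catalog-store.kind": "gravitino",                                  # must be "gravitino"
    "table.catalog-store.gravitino.gravitino.metalake": "test",               # metalake to request
    "table.catalog-store.gravitino.gravitino.uri": "http://localhost:8090",   # Gravitino server
}

# Render as the "key: value" lines that go into flink-conf.yaml.
conf_lines = [f"{key}: {value}" for key, value in flink_conf.items()]
```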
diff --git a/docs/hadoop-catalog.md b/docs/hadoop-catalog.md
index d28e6d93b..4453cb317 100644
--- a/docs/hadoop-catalog.md
+++ b/docs/hadoop-catalog.md
@@ -47,7 +47,7 @@ The Hadoop catalog supports multi-level authentication to control access, allowi
- **Schema**: Inherits the authentication setting from the catalog if not explicitly set. For more information about schema settings, please refer to [Schema properties](#schema-properties).
- **Fileset**: Inherits the authentication setting from the schema if not explicitly set. For more information about fileset settings, please refer to [Fileset properties](#fileset-properties).
-The default value of `authentication.impersonation-enable` is false, and the default value for catalogs about this configuration is false, for
+The default value of `authentication.impersonation-enable` is false, and the default value for catalogs about this configuration is false, for
schemas and filesets, the default value is inherited from the parent. Value set by the user will override the parent value, and the priority mechanism is the same as authentication.
### Catalog operations
@@ -82,12 +82,12 @@ Refer to [Schema operation](./manage-fileset-metadata-using-gravitino.md#schema-
### Fileset properties
-| Property name | Description | Default value | Required | Since Version |
-|----------------------------------------------------|--------------------------------------------------------------------------------------------------------|--------------------------|----------|-----------------|
-| `authentication.impersonation-enable` | Whether to enable impersonation for the Hadoop catalog fileset. | The parent(schema) value | No | 0.6.0 |
-| `authentication.type` | The type of authentication for Hadoop catalog fileset, currently we only support `kerberos`, `simple`. | The parent(schema) value | No | 0.6.0 |
-| `authentication.kerberos.principal` | The principal of the Kerberos authentication for the fileset. | The parent(schema) value | No | 0.6.0 |
-| `authentication.kerberos.keytab-uri` | The URI of The keytab for the Kerberos authentication for the fileset. | The parent(schema) value | No | 0.6.0 |
+| Property name | Description | Default value | Required | Since Version |
+|---------------------------------------|--------------------------------------------------------------------------------------------------------|--------------------------|----------|------------------|
+| `authentication.impersonation-enable` | Whether to enable impersonation for the Hadoop catalog fileset. | The parent(schema) value | No | 0.6.0-incubating |
+| `authentication.type` | The type of authentication for Hadoop catalog fileset, currently we only support `kerberos`, `simple`. | The parent(schema) value | No | 0.6.0-incubating |
+| `authentication.kerberos.principal` | The principal of the Kerberos authentication for the fileset. | The parent(schema) value | No | 0.6.0-incubating |
+| `authentication.kerberos.keytab-uri` | The URI of The keytab for the Kerberos authentication for the fileset. | The parent(schema) value | No | 0.6.0-incubating |
### Fileset operations
diff --git a/docs/how-to-use-gvfs.md b/docs/how-to-use-gvfs.md
index f36629737..3a993e708 100644
--- a/docs/how-to-use-gvfs.md
+++ b/docs/how-to-use-gvfs.md
@@ -14,7 +14,7 @@ To use `fileset` managed by Gravitino, Gravitino provides a virtual file system
the Gravitino Virtual File System (GVFS):
* In Java, it's built on top of the Hadoop Compatible File System(HCFS) interface.
* In Python, it's built on top of the [fsspec](https://filesystem-spec.readthedocs.io/en/stable/index.html)
-interface.
+ interface.
GVFS is a virtual layer that manages the files and directories in the fileset through a virtual
path, without needing to understand the specific storage details of the fileset. You can access
@@ -164,18 +164,18 @@ fs.getFileStatus(filesetPath);
1. Add the GVFS runtime jar to the Spark environment.
- You can use `--packages` or `--jars` in the Spark submit shell to include the Gravitino Virtual
- File System runtime jar, like so:
+ You can use `--packages` or `--jars` in the Spark submit shell to include the Gravitino Virtual
+ File System runtime jar, like so:
```shell
./${SPARK_HOME}/bin/spark-submit --packages org.apache.gravitino:filesystem-hadoop3-runtime:${version}
```
- If you want to include the Gravitino Virtual File System runtime jar in your Spark installation, add it to the `${SPARK_HOME}/jars/` folder.
+ If you want to include the Gravitino Virtual File System runtime jar in your Spark installation, add it to the `${SPARK_HOME}/jars/` folder.
2. Configure the Hadoop configuration when submitting the job.
- You can configure in the shell command in this way:
+ You can configure in the shell command in this way:
```shell
--conf spark.hadoop.fs.AbstractFileSystem.gvfs.impl=org.apache.gravitino.filesystem.hadoop.Gvfs
@@ -186,7 +186,7 @@ fs.getFileStatus(filesetPath);
3. Perform operations on the fileset storage in your code.
- Finally, you can access the fileset storage in your Spark program:
+ Finally, you can access the fileset storage in your Spark program:
```scala
// Scala code
@@ -206,9 +206,9 @@ For Tensorflow to support GVFS, you need to recompile the [tensorflow-io](https:
1. First, add a patch and recompile tensorflow-io.
- You need to add a [patch](https://github.com/tensorflow/io/pull/1970) to support GVFS on
- tensorflow-io. Then you can follow the [tutorial](https://github.com/tensorflow/io/blob/master/docs/development.md)
- to recompile your code and release the tensorflow-io module.
+ You need to add a [patch](https://github.com/tensorflow/io/pull/1970) to support GVFS on
+ tensorflow-io. Then you can follow the [tutorial](https://github.com/tensorflow/io/blob/master/docs/development.md)
+ to recompile your code and release the tensorflow-io module.
2. Then you need to configure the Hadoop configuration.
@@ -335,18 +335,17 @@ to recompile the native libraries like `libhdfs` and others, and completely repl
### Configuration
-| Configuration item | Description | Default value | Required | Since version |
-|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|-----------------------------------|---------------|
-| `server_uri` | The Gravitino server uri, e.g. `http://localhost:8090`. | (none) | Yes | 0.6.0 |
-| `metalake_name` | The metalake name which the fileset belongs to. | (none) | Yes | 0.6.0 |
-| `cache_size` | The cache capacity of the Gravitino Virtual File System. | `20` | No | 0.6.0 |
-| `cache_expired_time` | The value of time that the cache expires after accessing in the Gravitino Virtual File System. The value is in `seconds`. | `3600` | No | 0.6.0 |
-| `auth_type` | The auth type to initialize the Gravitino client to use with the Gravitino Virtual File System. Currently supports `simple` and `oauth2` auth types. | `simple` | No | 0.6.0 |
-| `oauth2_server_uri` | The auth server URI for the Gravitino client when using `oauth2` auth type. | (none) | Yes if you use `oauth2` auth type | 0.7.0 |
-| `oauth2_credential` | The auth credential for the Gravitino client when using `oauth2` auth type. | (none) | Yes if you use `oauth2` auth type | 0.7.0 |
-| `oauth2_path` | The auth server path for the Gravitino client when using `oauth2` auth type. Please remove the first slash `/` from the path, for example `oauth/token`. | (none) | Yes if you use `oauth2` auth type | 0.7.0 |
-| `oauth2_scope` | The auth scope for the Gravitino client when using `oauth2` auth type with the Gravitino Virtual File System. | (none) | Yes if you use `oauth2` auth type | 0.7.0 |
-
+| Configuration item | Description | Default value | Required | Since version |
+|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|-----------------------------------|------------------|
+| `server_uri` | The Gravitino server uri, e.g. `http://localhost:8090`. | (none) | Yes | 0.6.0-incubating |
+| `metalake_name` | The metalake name which the fileset belongs to. | (none) | Yes | 0.6.0-incubating |
+| `cache_size` | The cache capacity of the Gravitino Virtual File System. | `20` | No | 0.6.0-incubating |
+| `cache_expired_time` | The value of time that the cache expires after accessing in the Gravitino Virtual File System. The value is in `seconds`. | `3600` | No | 0.6.0-incubating |
+| `auth_type` | The auth type to initialize the Gravitino client to use with the Gravitino Virtual File System. Currently supports `simple` and `oauth2` auth types. | `simple` | No | 0.6.0-incubating |
+| `oauth2_server_uri` | The auth server URI for the Gravitino client when using `oauth2` auth type. | (none) | Yes if you use `oauth2` auth type | 0.7.0-incubating |
+| `oauth2_credential` | The auth credential for the Gravitino client when using `oauth2` auth type. | (none) | Yes if you use `oauth2` auth type | 0.7.0-incubating |
+| `oauth2_path` | The auth server path for the Gravitino client when using `oauth2` auth type. Please remove the first slash `/` from the path, for example `oauth/token`. | (none) | Yes if you use `oauth2` auth type | 0.7.0-incubating |
+| `oauth2_scope` | The auth scope for the Gravitino client when using `oauth2` auth type with the Gravitino Virtual File System. | (none) | Yes if you use `oauth2` auth type | 0.7.0-incubating |
You can configure these properties when obtaining the `Gravitino Virtual FileSystem` in Python like this:
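The configuration table above maps to a plain options dictionary. A hedged sketch that only assembles the documented options without contacting a server; the option names come from the table, while the values and the `oauth2` completeness check are illustrative:

```python
# GVFS client options documented in the table above; values are examples.
options = {
    "server_uri": "http://localhost:8090",  # Gravitino server address
    "metalake_name": "test",                # metalake the fileset belongs to
    "cache_size": 20,                       # table default
    "cache_expired_time": 3600,             # seconds, table default
    "auth_type": "simple",                  # "oauth2" requires the oauth2_* keys
}

# When auth_type is "oauth2", the table marks these four options as required.
OAUTH2_REQUIRED = ("oauth2_server_uri", "oauth2_credential", "oauth2_path", "oauth2_scope")
missing = [k for k in OAUTH2_REQUIRED if k not in options] if options["auth_type"] == "oauth2" else []
```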
diff --git a/docs/iceberg-rest-service.md b/docs/iceberg-rest-service.md
index 4217350da..ad44a2014 100644
--- a/docs/iceberg-rest-service.md
+++ b/docs/iceberg-rest-service.md
@@ -27,7 +27,7 @@ There are three deployment scenarios for Gravitino Iceberg REST server:
- A standalone server in a standalone Gravitino Iceberg REST server package.
- A standalone server in the Gravitino server package.
- An auxiliary service embedded in the Gravitino server.
-
+
For detailed instructions on how to build and install the Gravitino server package, please refer to [How to build](./how-to-build.md) and [How to install](./how-to-install.md). To build the Gravitino Iceberg REST server package, use the command `./gradlew compileIcebergRESTServer -x test`. Alternatively, to create the corresponding compressed package in the distribution directory, use `./gradlew assembleIcebergRESTServer -x test`. The Gravitino Iceberg REST server package includes the fo [...]
```text
@@ -46,7 +46,7 @@ For detailed instructions on how to build and install the Gravitino server packa
## Apache Gravitino Iceberg REST catalog server configuration
-There are distinct configuration files for standalone and auxiliary server: `gravitino-iceberg-rest-server.conf` is used for the standalone server, while `gravitino.conf` is for the auxiliary server. Although the configuration files differ, the configuration items remain the same.
+There are distinct configuration files for standalone and auxiliary server: `gravitino-iceberg-rest-server.conf` is used for the standalone server, while `gravitino.conf` is for the auxiliary server. Although the configuration files differ, the configuration items remain the same.
Starting with version `0.6.0-incubating`, the prefix `gravitino.auxService.iceberg-rest.` for auxiliary server configurations has been deprecated. If both `gravitino.auxService.iceberg-rest.key` and `gravitino.iceberg-rest.key` are present, the latter will take precedence. The configurations listed below use the `gravitino.iceberg-rest.` prefix.
@@ -102,13 +102,13 @@ The detailed configuration items are as follows:
Gravitino Iceberg REST service supports using static access-key-id and secret-access-key to access S3 data.
-| Configuration item | Description | Default value | Required | Since Version |
-|-----------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, use `org.apache.iceberg.aws.s3.S3FileIO` for S3. | (none) | No | 0.6.0 |
-| `gravitino.iceberg-rest.s3-access-key-id` | The static access key ID used to access S3 data. | (none) | No | 0.6.0 |
-| `gravitino.iceberg-rest.s3-secret-access-key` | The static secret access key used to access S3 data. | (none) | No | 0.6.0 |
-| `gravitino.iceberg-rest.s3-endpoint` | An alternative endpoint of the S3 service, This could be used for S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. | (none) | No | 0.6.0 |
-| `gravitino.iceberg-rest.s3-region` | The region of the S3 service, like `us-west-2`. | (none) | No | 0.6.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|-----------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, use `org.apache.iceberg.aws.s3.S3FileIO` for S3. | (none) | No | 0.6.0-incubating |
+| `gravitino.iceberg-rest.s3-access-key-id` | The static access key ID used to access S3 data. | (none) | No | 0.6.0-incubating |
+| `gravitino.iceberg-rest.s3-secret-access-key` | The static secret access key used to access S3 data. | (none) | No | 0.6.0-incubating |
+| `gravitino.iceberg-rest.s3-endpoint` | An alternative endpoint of the S3 service, This could be used for S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. | (none) | No | 0.6.0-incubating |
+| `gravitino.iceberg-rest.s3-region` | The region of the S3 service, like `us-west-2`. | (none) | No | 0.6.0-incubating |
For other Iceberg s3 properties not managed by Gravitino like `s3.sse.type`, you could config it directly by `gravitino.iceberg-rest.s3.sse.type`.
@@ -120,12 +120,12 @@ To configure the JDBC catalog backend, set the `gravitino.iceberg-rest.warehouse
Gravitino Iceberg REST service supports using static access-key-id and secret-access-key to access OSS data.
-| Configuration item | Description | Default value | Required | Since Version |
-|------------------------------------------------|-------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, use `org.apache.iceberg.aliyun.oss.OSSFileIO` for OSS. | (none) | No | 0.6.0 |
-| `gravitino.iceberg-rest.oss-access-key-id` | The static access key ID used to access OSS data. | (none) | No | 0.7.0 |
-| `gravitino.iceberg-rest.oss-secret-access-key` | The static secret access key used to access OSS data. | (none) | No | 0.7.0 |
-| `gravitino.iceberg-rest.oss-endpoint` | The endpoint of Aliyun OSS service. | (none) | No | 0.7.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, use `org.apache.iceberg.aliyun.oss.OSSFileIO` for OSS. | (none) | No | 0.6.0-incubating |
+| `gravitino.iceberg-rest.oss-access-key-id` | The static access key ID used to access OSS data. | (none) | No | 0.7.0-incubating |
+| `gravitino.iceberg-rest.oss-secret-access-key` | The static secret access key used to access OSS data. | (none) | No | 0.7.0-incubating |
+| `gravitino.iceberg-rest.oss-endpoint` | The endpoint of Aliyun OSS service. | (none) | No | 0.7.0-incubating |
For other Iceberg OSS properties not managed by Gravitino like `client.security-token`, you could config it directly by `gravitino.iceberg-rest.client.security-token`.
@@ -137,9 +137,9 @@ Please set the `gravitino.iceberg-rest.warehouse` parameter to `oss://{bucket_na
Supports using google credential file to access GCS data.
-| Configuration item | Description | Default value | Required | Since Version |
-|----------------------------------|----------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.io-impl` | The io implementation for `FileIO` in Iceberg, use `org.apache.iceberg.gcp.gcs.GCSFileIO` for GCS. | (none) | No | 0.6.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|----------------------------------|----------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.io-impl` | The io implementation for `FileIO` in Iceberg, use `org.apache.iceberg.gcp.gcs.GCSFileIO` for GCS. | (none) | No | 0.6.0-incubating |
For other Iceberg GCS properties not managed by Gravitino, like `gcs.project-id`, you can configure them directly via `gravitino.iceberg-rest.gcs.project-id`.
@@ -161,9 +161,9 @@ Builds with Hadoop 2.10.x. There may be compatibility issues when accessing Hado
For other storages that are not managed by Gravitino directly, you can manage them through custom catalog properties.
-| Configuration item | Description | Default value | Required | Since Version |
-|----------------------------------|-----------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, please use the full qualified classname. | (none) | No | 0.6.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|----------------------------------|-------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.io-impl` | The IO implementation for `FileIO` in Iceberg, please use the fully qualified class name. | (none) | No | 0.6.0-incubating |
To pass custom properties such as `security-token` to your custom `FileIO`, you can configure them directly via `gravitino.iceberg-rest.security-token`. `security-token` will be included in the properties when the `initialize` method of `FileIO` is invoked.
@@ -206,10 +206,11 @@ You must download the corresponding JDBC driver to the `iceberg-rest-server/libs
:::
#### Custom backend configuration
-| Configuration item | Description | Default value | Required | Since Version |
-|------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|------------------|----------|---------------|
-| `gravitino.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`custom`** for a Custom catalog. | `memory` | Yes | 0.2.0 |
-| `gravitino.iceberg-rest.catalog-backend-impl` | The fully-qualified class name of a custom catalog implementation, only worked if `catalog-backend` is `custom`. | (none) | No | 0.7.0 |
+
+| Configuration item | Description | Default value | Required | Since Version |
+|-----------------------------------------------|---------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`custom`** for a custom catalog. | `memory` | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.catalog-backend-impl` | The fully qualified class name of a custom catalog implementation; only works if `catalog-backend` is `custom`. | (none) | No | 0.7.0-incubating |
If you want to use a custom Iceberg Catalog as `catalog-backend`, you can add a corresponding jar file to the classpath and load a custom Iceberg Catalog implementation by specifying the `catalog-backend-impl` property.
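As a minimal sketch, assuming a hypothetical `com.example.MyIcebergCatalog` whose jar is already on the classpath, the custom backend configuration could look like:

```text
gravitino.iceberg-rest.catalog-backend = custom
# Hypothetical class name; replace with your own Iceberg Catalog implementation.
gravitino.iceberg-rest.catalog-backend-impl = com.example.MyIcebergCatalog
```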
@@ -217,18 +218,17 @@ If you want to use a custom Iceberg Catalog as `catalog-backend`, you can add a
You can access the view interface if you use the JDBC backend and enable the `jdbc.schema-version` property.
-| Configuration item | Description | Default value | Required | Since Version |
-|-------------------------------------------------|--------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.jdbc.schema-version` | The schema version of JDBC catalog backend, setting to `V1` if supporting view operations. | (none) | NO | 0.7.0 |
-
+| Configuration item | Description | Default value | Required | Since Version |
+|----------------------------------------------|--------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.jdbc.schema-version` | The schema version of the JDBC catalog backend; set to `V1` to support view operations. | (none) | No | 0.7.0-incubating |
#### Multi catalog support
The Gravitino Iceberg REST server supports multiple catalogs and offers a configuration-based catalog management system.
-| Configuration item | Description | Default value | Required | Since Version |
-|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------|----------|---------------|
-| `gravitino.iceberg-rest.catalog-provider` | Catalog provider class name, you can develop a class that implements `IcebergTableOpsProvider` and add the corresponding jar file to the Iceberg REST service classpath directory. | `config-based-provider` | No | 0.7.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------|------------------|
+| `gravitino.iceberg-rest.catalog-provider` | Catalog provider class name; you can develop a class that implements `IcebergTableOpsProvider` and add the corresponding jar file to the Iceberg REST service classpath directory. | `config-based-provider` | No | 0.7.0-incubating |
##### Configuration based catalog provider
@@ -273,11 +273,11 @@ You can access different catalogs by setting the `prefix` to the specific catalo
When using a Gravitino server based catalog provider, you can leverage Gravitino to support dynamic catalog management for the Iceberg REST server.
-| Configuration item | Description | Default value | Required | Since Version |
-|--------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.gravitino-uri` | The uri of Gravitino server address, only worked if `catalog-provider` is `gravitino-based-provider`. | (none) | No | 0.7.0 |
-| `gravitino.iceberg-rest.gravitino-metalake` | The metalake name that `gravitino-based-provider` used to request to Gravitino, only worked if `catalog-provider` is `gravitino-based-provider`. | (none) | No | 0.7.0 |
-| `gravitino.iceberg-rest.catalog-cache-eviction-interval-ms` | Catalog cache eviction interval. | 3600000 | No | 0.7.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|-------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.gravitino-uri` | The URI of the Gravitino server; only works if `catalog-provider` is `gravitino-based-provider`. | (none) | No | 0.7.0-incubating |
+| `gravitino.iceberg-rest.gravitino-metalake` | The metalake name that `gravitino-based-provider` uses when requesting Gravitino; only works if `catalog-provider` is `gravitino-based-provider`. | (none) | No | 0.7.0-incubating |
+| `gravitino.iceberg-rest.catalog-cache-eviction-interval-ms` | Catalog cache eviction interval. | 3600000 | No | 0.7.0-incubating |
```text
gravitino.iceberg-rest.catalog-cache-eviction-interval-ms = 300000
@@ -311,10 +311,9 @@ Gravitino provides a pluggable metrics store interface to store and delete Icebe
### Misc configurations
-| Configuration item | Description | Default value | Required | Since Version |
-|---------------------------------------------|--------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.iceberg-rest.extension-packages` | Comma-separated list of Iceberg REST API packages to expand. | (none) | No | 0.7.0 |
-
+| Configuration item | Description | Default value | Required | Since Version |
+|---------------------------------------------|--------------------------------------------------------------|---------------|----------|------------------|
+| `gravitino.iceberg-rest.extension-packages` | Comma-separated list of Iceberg REST API packages to expand. | (none) | No | 0.7.0-incubating |
## Starting the Iceberg REST server
diff --git a/docs/security/authorization-pushdown.md b/docs/security/authorization-pushdown.md
index e521402f6..148e76b5f 100644
--- a/docs/security/authorization-pushdown.md
+++ b/docs/security/authorization-pushdown.md
@@ -17,14 +17,14 @@ This module translates Gravitino's authorization model into the permission rules
In order to use the Authorization Ranger Hive Plugin, you need to configure the following properties and [Apache Hive catalog properties](../apache-hive-catalog.md#catalog-properties):
-| Property Name | Description | Default Value | Required | Since Version |
-|-------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `authorization-provider` | Providers to use to implement authorization plugin such as `ranger`. | (none) | No | 0.6.0 |
-| `authorization.ranger.admin.url` | The Apache Ranger web URIs. | (none) | No | 0.6.0 |
-| `authorization.ranger.auth.type` | The Apache Ranger authentication type `simple` or `kerberos`. | `simple` | No | 0.6.0 |
-| `authorization.ranger.username` | The Apache Ranger admin web login username (auth type=simple), or kerberos principal(auth type=kerberos), Need have Ranger administrator permission. | (none) | No | 0.6.0 |
-| `authorization.ranger.password` | The Apache Ranger admin web login user password (auth type=simple), or path of the keytab file(auth type=kerberos) | (none) | No | 0.6.0 |
-| `authorization.ranger.service.name` | The Apache Ranger service name. | (none) | No | 0.6.0 |
+| Property Name | Description | Default Value | Required | Since Version |
+|-------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `authorization-provider` | The provider used to implement the authorization plugin, such as `ranger`. | (none) | No | 0.6.0-incubating |
+| `authorization.ranger.admin.url` | The Apache Ranger web URI. | (none) | No | 0.6.0-incubating |
+| `authorization.ranger.auth.type` | The Apache Ranger authentication type, `simple` or `kerberos`. | `simple` | No | 0.6.0-incubating |
+| `authorization.ranger.username` | The Apache Ranger admin web login username (auth type=simple), or the Kerberos principal (auth type=kerberos); requires Ranger administrator permission. | (none) | No | 0.6.0-incubating |
+| `authorization.ranger.password` | The Apache Ranger admin web login password (auth type=simple), or the path of the keytab file (auth type=kerberos). | (none) | No | 0.6.0-incubating |
+| `authorization.ranger.service.name` | The Apache Ranger service name. | (none) | No | 0.6.0-incubating |
Once you have used the correct configuration, you can perform authorization operations by calling Gravitino [authorization RESTful API](https://gravitino.apache.org/docs/latest/api/rest/grant-roles-to-a-user).
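A minimal sketch of these catalog properties, assuming a Ranger admin at a placeholder address and a hypothetical service name `hiveDev` (all values are placeholders):

```text
authorization-provider = ranger
authorization.ranger.admin.url = http://<ranger-host>:6080
authorization.ranger.auth.type = simple
authorization.ranger.username = <admin-user>
authorization.ranger.password = <admin-password>
authorization.ranger.service.name = hiveDev
```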
diff --git a/docs/security/how-to-authenticate.md b/docs/security/how-to-authenticate.md
index c98676350..61c2295f0 100644
--- a/docs/security/how-to-authenticate.md
+++ b/docs/security/how-to-authenticate.md
@@ -40,8 +40,8 @@ Gravitino only supports external OAuth 2.0 servers. To enable OAuth mode, users
- First, users need to guarantee that the external correctly configured OAuth 2.0 server supports Bearer JWT.
- Then, on the server side, users should set `gravitino.authenticators` as `oauth` and give
-`gravitino.authenticator.oauth.defaultSignKey`, `gravitino.authenticator.oauth.serverUri` and
-`gravitino.authenticator.oauth.tokenPath` a proper value.
+  `gravitino.authenticator.oauth.defaultSignKey`, `gravitino.authenticator.oauth.serverUri` and
+  `gravitino.authenticator.oauth.tokenPath` a proper value.
- Next, for the client side, users can enable `OAuth` mode by the following code:
```java
@@ -88,18 +88,18 @@ The URI must use the hostname of server instead of IP.
### Server configuration
-| Configuration item | Description | Default value | Required | Since version |
-|---------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|--------------------------------------------|---------------|
-| `gravitino.authenticator` | It is deprecated since Gravitino 0.6.0. Please use `gravitino.authenticators` instead. | `simple` | No | 0.3.0 |
-| `gravitino.authenticators` | The authenticators which Gravitino uses, setting as `simple`,`oauth` or `kerberos`. Multiple authenticators are separated by commas. If a request is supported by multiple authenticators simultaneously, the first authenticator will be used by default. | `simple` | No | 0.6.0 |
-| `gravitino.authenticator.oauth.serviceAudience` | The audience name when Gravitino uses OAuth as the authenticator. | `GravitinoServer` | No | 0.3.0 |
-| `gravitino.authenticator.oauth.allowSkewSecs` | The JWT allows skew seconds when Gravitino uses OAuth as the authenticator. | `0` | No | 0.3.0 |
-| `gravitino.authenticator.oauth.defaultSignKey` | The signing key of JWT when Gravitino uses OAuth as the authenticator. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
-| `gravitino.authenticator.oauth.signAlgorithmType` | The signature algorithm when Gravitino uses OAuth as the authenticator. | `RS256` | No | 0.3.0 |
-| `gravitino.authenticator.oauth.serverUri` | The URI of the default OAuth server. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
-| `gravitino.authenticator.oauth.tokenPath` | The path for token of the default OAuth server. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
-| `gravitino.authenticator.kerberos.principal` | Indicates the Kerberos principal to be used for HTTP endpoint. Principal should start with `HTTP/`. | (none) | Yes if use `kerberos` as the authenticator | 0.4.0 |
-| `gravitino.authenticator.kerberos.keytab` | Location of the keytab file with the credentials for the principal. | (none) | Yes if use `kerberos` as the authenticator | 0.4.0 |
+| Configuration item | Description | Default value | Required | Since version |
+|---------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|--------------------------------------------|------------------|
+| `gravitino.authenticator` | It is deprecated since Gravitino 0.6.0. Please use `gravitino.authenticators` instead. | `simple` | No | 0.3.0 |
+| `gravitino.authenticators` | The authenticators which Gravitino uses, setting as `simple`,`oauth` or `kerberos`. Multiple authenticators are separated by commas. If a request is supported by multiple authenticators simultaneously, the first authenticator will be used by default. | `simple` | No | 0.6.0-incubating |
+| `gravitino.authenticator.oauth.serviceAudience` | The audience name when Gravitino uses OAuth as the authenticator. | `GravitinoServer` | No | 0.3.0 |
+| `gravitino.authenticator.oauth.allowSkewSecs` | The JWT allows skew seconds when Gravitino uses OAuth as the authenticator. | `0` | No | 0.3.0 |
+| `gravitino.authenticator.oauth.defaultSignKey` | The signing key of JWT when Gravitino uses OAuth as the authenticator. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
+| `gravitino.authenticator.oauth.signAlgorithmType` | The signature algorithm when Gravitino uses OAuth as the authenticator. | `RS256` | No | 0.3.0 |
+| `gravitino.authenticator.oauth.serverUri` | The URI of the default OAuth server. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
+| `gravitino.authenticator.oauth.tokenPath` | The path for token of the default OAuth server. | (none) | Yes if use `oauth` as the authenticator | 0.3.0 |
+| `gravitino.authenticator.kerberos.principal` | Indicates the Kerberos principal to be used for HTTP endpoint. Principal should start with `HTTP/`. | (none) | Yes if use `kerberos` as the authenticator | 0.4.0 |
+| `gravitino.authenticator.kerberos.keytab` | Location of the keytab file with the credentials for the principal. | (none) | Yes if use `kerberos` as the authenticator | 0.4.0 |
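As a sketch, an OAuth setup combines the three properties the table marks as required for `oauth`; the server address and key below are placeholders:

```text
gravitino.authenticators = oauth
gravitino.authenticator.oauth.serverUri = http://<oauth-server>:8177
gravitino.authenticator.oauth.tokenPath = /oauth2/token
gravitino.authenticator.oauth.defaultSignKey = <base64-encoded-sign-key>
```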
The signature algorithms that Gravitino supports are as follows:
diff --git a/docs/spark-connector/spark-catalog-iceberg.md b/docs/spark-connector/spark-catalog-iceberg.md
index f0b1f2f64..dca23db6a 100644
--- a/docs/spark-connector/spark-catalog-iceberg.md
+++ b/docs/spark-connector/spark-catalog-iceberg.md
@@ -95,23 +95,23 @@ SELECT * FROM employee FOR SYSTEM_TIME AS OF '2024-05-27 01:01:00';
DESC EXTENDED employee;
```
-For more details about `CALL`, please refer to the [Spark Procedures description](https://iceberg.apache.org/docs/1.5.2/spark-procedures/#spark-procedures) in Iceberg official document. 
+For more details about `CALL`, please refer to the [Spark Procedures description](https://iceberg.apache.org/docs/1.5.2/spark-procedures/#spark-procedures) in Iceberg official document.
## Catalog properties
Gravitino spark connector will transform below property names which are defined in catalog properties to Spark Iceberg connector configuration.
-| Gravitino catalog property name | Spark Iceberg connector configuration | Description | Since Version |
-|---------------------------------|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|
-| `catalog-backend` | `type` | Catalog backend type | 0.5.0 |
-| `uri` | `uri` | Catalog backend uri | 0.5.0 |
-| `warehouse` | `warehouse` | Catalog backend warehouse | 0.5.0 |
-| `jdbc-user` | `jdbc.user` | JDBC user name | 0.5.0 |
-| `jdbc-password` | `jdbc.password` | JDBC password | 0.5.0 |
-| `io-impl` | `io-impl` | The io implementation for `FileIO` in Iceberg. | 0.6.0 |
-| `s3-endpoint` | `s3.endpoint` | An alternative endpoint of the S3 service, This could be used for S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. | 0.6.0 |
-| `s3-region` | `client.region` | The region of the S3 service, like `us-west-2`. | 0.6.0 |
-| `oss-endpoint` | `oss.endpoint` | The endpoint of Aliyun OSS service. | 0.7.0 |
+| Gravitino catalog property name | Spark Iceberg connector configuration | Description | Since Version |
+|---------------------------------|---------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|
+| `catalog-backend` | `type` | Catalog backend type | 0.5.0 |
+| `uri` | `uri` | Catalog backend uri | 0.5.0 |
+| `warehouse` | `warehouse` | Catalog backend warehouse | 0.5.0 |
+| `jdbc-user` | `jdbc.user` | JDBC user name | 0.5.0 |
+| `jdbc-password` | `jdbc.password` | JDBC password | 0.5.0 |
+| `io-impl` | `io-impl` | The IO implementation for `FileIO` in Iceberg. | 0.6.0-incubating |
+| `s3-endpoint` | `s3.endpoint` | An alternative endpoint of the S3 service. This can be used for S3FileIO with any S3-compatible object storage service that has a different endpoint, or to access a private S3 endpoint in a virtual private cloud. | 0.6.0-incubating |
+| `s3-region` | `client.region` | The region of the S3 service, like `us-west-2`. | 0.6.0-incubating |
+| `oss-endpoint` | `oss.endpoint` | The endpoint of Aliyun OSS service. | 0.7.0-incubating |
Gravitino catalog property names with the prefix `spark.bypass.` are passed to Spark Iceberg connector. For example, using `spark.bypass.clients` to pass the `clients` to the Spark Iceberg connector.
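To illustrate the mapping and the `spark.bypass.` pass-through, a hypothetical Iceberg catalog's properties might be (all values are placeholders):

```text
catalog-backend = hive
uri = thrift://<hive-metastore>:9083
warehouse = hdfs://<namenode>:9000/user/iceberg/warehouse
# Passed to the Spark Iceberg connector as `clients` (prefix stripped):
spark.bypass.clients = 5
```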
diff --git a/docs/spark-connector/spark-integration-test.md b/docs/spark-connector/spark-integration-test.md
index 35ad27b56..cba1c104d 100644
--- a/docs/spark-connector/spark-integration-test.md
+++ b/docs/spark-connector/spark-integration-test.md
@@ -7,7 +7,7 @@ license: "This software is licensed under the Apache License version 2."
## Overview
-There are two types of integration tests in spark connector, normal integration test like `SparkXXCatalogIT`, and the golden file integration test. 
+There are two types of integration tests in spark connector, normal integration test like `SparkXXCatalogIT`, and the golden file integration test.
## Normal integration test
@@ -28,15 +28,15 @@ Golden file integration test are mainly to test the correctness of the SQL resul
Please change the Spark version number if you want to test other Spark versions.
If you want to change the test behaviour, please modify `spark-connector/spark-common/src/test/resources/spark-test.conf`.
-| Configuration item | Description | Default value | Required | Since Version |
-|--------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|----------|---------------|
-| `gravitino.spark.test.dir` | The Spark SQL test base dir, include `test-sqls` and `data`. | `spark-connector/spark-common/src/test/resources/` | No | 0.6.0 |
-| `gravitino.spark.test.sqls` | Specify the test SQLs, using directory to specify group of SQLs like `test-sqls/hive`, using file path to specify one SQL like `test-sqls/hive/basic.sql`, use `,` to split multi part | run all SQLs | No | 0.6.0 |
-| `gravitino.spark.test.generateGoldenFiles` | Whether generate golden files which are used to check the correctness of the SQL result | false | No | 0.6.0 |
-| `gravitino.spark.test.metalake` | The metalake name to run the test | `test` | No | 0.6.0 |
-| `gravitino.spark.test.setupEnv` | Whether to setup Gravitino and Hive environment | `false` | No | 0.6.0 |
-| `gravitino.spark.test.uri` | Gravitino uri address, only available when `gravitino.spark.test.setupEnv` is false | http://127.0.0.1:8090 | No | 0.6.0 |
-| `gravitino.spark.test.iceberg.warehouse` | The warehouse location, only available when `gravitino.spark.test.setupEnv` is false | hdfs://127.0.0.1:9000/user/hive/warehouse-spark-test | No | 0.6.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|--------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|----------|------------------|
+| `gravitino.spark.test.dir` | The Spark SQL test base dir, including `test-sqls` and `data`. | `spark-connector/spark-common/src/test/resources/` | No | 0.6.0-incubating |
+| `gravitino.spark.test.sqls` | Specify the test SQLs; use a directory to specify a group of SQLs like `test-sqls/hive`, a file path to specify one SQL like `test-sqls/hive/basic.sql`, and `,` to separate multiple parts | run all SQLs | No | 0.6.0-incubating |
+| `gravitino.spark.test.generateGoldenFiles` | Whether to generate golden files, which are used to check the correctness of the SQL result | false | No | 0.6.0-incubating |
+| `gravitino.spark.test.metalake` | The metalake name to run the test | `test` | No | 0.6.0-incubating |
+| `gravitino.spark.test.setupEnv` | Whether to set up the Gravitino and Hive environment | `false` | No | 0.6.0-incubating |
+| `gravitino.spark.test.uri` | The Gravitino URI address; only available when `gravitino.spark.test.setupEnv` is false | http://127.0.0.1:8090 | No | 0.6.0-incubating |
+| `gravitino.spark.test.iceberg.warehouse` | The warehouse location; only available when `gravitino.spark.test.setupEnv` is false | hdfs://127.0.0.1:9000/user/hive/warehouse-spark-test | No | 0.6.0-incubating |
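For example, a `spark-test.conf` that runs only the Hive SQLs against an already running Gravitino server might look like this sketch (the URI and metalake name are placeholders):

```text
gravitino.spark.test.setupEnv = false
gravitino.spark.test.uri = http://127.0.0.1:8090
gravitino.spark.test.metalake = test
gravitino.spark.test.sqls = test-sqls/hive
gravitino.spark.test.generateGoldenFiles = false
```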
The test SQL files are located in `spark-connector/spark-common/src/test/resources/` by default. There are three directories:
- `hive`, SQL tests for Hive catalog.