This is an automated email from the ASF dual-hosted git repository.
jshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git
The following commit(s) were added to refs/heads/main by this push:
new 67b7be044 [#4954] docs(hudi-catalog): Add docs for Hudi catalog (#4976)
67b7be044 is described below
commit 67b7be044efca8d2009d12bd9c3c5aadb76dbb22
Author: mchades <[email protected]>
AuthorDate: Mon Oct 14 15:05:18 2024 +0800
[#4954] docs(hudi-catalog): Add docs for Hudi catalog (#4976)
### What changes were proposed in this pull request?
Add docs for Hudi catalog
### Why are the changes needed?
Fix: #4954
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
no need
---
docs/lakehouse-hudi-catalog.md | 110 +++++++++++++++++++++
docs/manage-relational-metadata-using-gravitino.md | 6 ++
2 files changed, 116 insertions(+)
diff --git a/docs/lakehouse-hudi-catalog.md b/docs/lakehouse-hudi-catalog.md
new file mode 100644
index 000000000..be6d328bf
--- /dev/null
+++ b/docs/lakehouse-hudi-catalog.md
@@ -0,0 +1,110 @@
+---
+title: "Hudi catalog"
+slug: /lakehouse-hudi-catalog
+keywords:
+ - lakehouse
+ - hudi
+ - metadata
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Apache Gravitino provides the ability to manage Apache Hudi metadata.
+
+### Requirements and limitations
+
+:::info
+Tested and verified with Apache Hudi `0.15.0`.
+:::
+
+## Catalog
+
+### Catalog capabilities
+
+- Works as a catalog proxy, supporting `HMS` as the catalog backend.
+- Only supports read operations (list and load) for Hudi schemas and tables.
+- Doesn't support timeline management operations yet.
+
+### Catalog properties
+
+| Property name | Description | Default value | Required | Since Version |
+|------------------------------------------|-------------|---------------|----------|------------------|
+| `catalog-backend` | Catalog backend of Gravitino Hudi catalog. Only supports `hms` now. | (none) | Yes | 0.7.0-incubating |
+| `uri` | The URI associated with the backend, such as `thrift://127.0.0.1:9083` for the HMS backend. | (none) | Yes | 0.7.0-incubating |
+| `client.pool-size` | For HMS backend. The maximum number of Hive metastore clients in the pool for Gravitino. | 1 | No | 0.7.0-incubating |
+| `client.pool-cache.eviction-interval-ms` | For HMS backend. The cache pool eviction interval, in milliseconds. | 300000 | No | 0.7.0-incubating |
+| `gravitino.bypass.` | Property names with this prefix are passed down to the underlying backend client. For example, `gravitino.bypass.hive.metastore.failure.retries = 3` sets 3 retries upon failure of Thrift metastore calls for the HMS backend. | (none) | No | 0.7.0-incubating |
+
+### Catalog operations
+
+Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) for more details.
+
+## Schema
+
+### Schema capabilities
+
+- Only supports read operations: listSchema, loadSchema, and schemaExists.
+
+### Schema properties
+
+- The `Location` is an optional property that shows the storage path to the Hudi database.
+
+### Schema operations
+
+Only read operations are supported: listSchema, loadSchema, and schemaExists.
+Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) for more details.
+
+## Table
+
+### Table capabilities
+
+- Only supports read operations: listTable, loadTable, and tableExists.
+
+### Table partitions
+
+- Supports loading Hudi partitioned tables (Hudi only supports identity partitioning).
+
+### Table sort orders
+
+- Doesn't support table sort orders.
+
+### Table distributions
+
+- Doesn't support table distributions.
+
+### Table indexes
+
+- Doesn't support table indexes.
+
+### Table properties
+
+- For the HMS backend, all table parameters stored in the HMS are returned as table properties.
+
+### Table column types
+
+The following table shows the mapping between Gravitino and [Apache Hudi column types](https://hudi.apache.org/docs/sql_ddl#supported-types):
+
+| Gravitino Type | Apache Hudi Type |
+|----------------|------------------|
+| `boolean` | `boolean` |
+| `integer` | `int` |
+| `long` | `long` |
+| `date` | `date` |
+| `timestamp` | `timestamp` |
+| `float` | `float` |
+| `double` | `double` |
+| `string` | `string` |
+| `decimal` | `decimal` |
+| `binary` | `bytes` |
+| `array` | `array` |
+| `map` | `map` |
+| `struct` | `struct` |
+
+### Table operations
+
+Only read operations are supported: listTable, loadTable, and tableExists.
+Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#table-operations) for more details.
diff --git a/docs/manage-relational-metadata-using-gravitino.md b/docs/manage-relational-metadata-using-gravitino.md
index fa2a11ac4..f810b4aa3 100644
--- a/docs/manage-relational-metadata-using-gravitino.md
+++ b/docs/manage-relational-metadata-using-gravitino.md
@@ -24,6 +24,7 @@ For more details, please refer to the related doc.
- [**Apache Doris**](./jdbc-doris-catalog.md)
- [**Apache Iceberg**](./lakehouse-iceberg-catalog.md)
- [**Apache Paimon**](./lakehouse-paimon-catalog.md)
+- [**Apache Hudi**](./lakehouse-hudi-catalog.md)
Assuming:
@@ -93,6 +94,7 @@ Currently, Gravitino supports the following catalog providers:
| `hive` | [Hive catalog property](./apache-hive-catalog.md#catalog-properties) |
| `lakehouse-iceberg` | [Iceberg catalog property](./lakehouse-iceberg-catalog.md#catalog-properties) |
| `lakehouse-paimon` | [Paimon catalog property](./lakehouse-paimon-catalog.md#catalog-properties) |
+| `lakehouse-hudi` | [Hudi catalog property](./lakehouse-hudi-catalog.md#catalog-properties) |
| `jdbc-mysql` | [MySQL catalog property](./jdbc-mysql-catalog.md#catalog-properties) |
| `jdbc-postgresql` | [PostgreSQL catalog property](./jdbc-postgresql-catalog.md#catalog-properties) |
| `jdbc-doris` | [Doris catalog property](./jdbc-doris-catalog.md#catalog-properties) |
@@ -326,6 +328,7 @@ Currently, Gravitino supports the following schema property:
| `hive` | [Hive schema property](./apache-hive-catalog.md#schema-properties) |
| `lakehouse-iceberg` | [Iceberg scheme property](./lakehouse-iceberg-catalog.md#schema-properties) |
| `lakehouse-paimon` | [Paimon scheme property](./lakehouse-paimon-catalog.md#schema-properties) |
+| `lakehouse-hudi` | [Hudi scheme property](./lakehouse-hudi-catalog.md#schema-properties) |
| `jdbc-mysql` | [MySQL schema property](./jdbc-mysql-catalog.md#schema-properties) |
| `jdbc-postgresql` | [PostgreSQL schema property](./jdbc-postgresql-catalog.md#schema-properties) |
| `jdbc-doris` | [Doris schema property](./jdbc-doris-catalog.md#schema-properties) |
@@ -807,6 +810,7 @@ The following is a table of the column default value that Gravitino supports for
| `hive` | ✘ |
| `lakehouse-iceberg` | ✘ |
| `lakehouse-paimon` | ✘ |
+| `lakehouse-hudi` | ✘ |
| `jdbc-mysql` | ✔ |
| `jdbc-postgresql` | ✔ |
@@ -820,6 +824,7 @@ The following table shows the column auto-increment that Gravitino supports for
| `hive` | ✘ |
| `lakehouse-iceberg` | ✘ |
| `lakehouse-paimon` | ✘ |
+| `lakehouse-hudi` | ✘ |
| `jdbc-mysql` | ✔([limitations](./jdbc-mysql-catalog.md#table-column-auto-increment)) |
| `jdbc-postgresql` | ✔ |
@@ -832,6 +837,7 @@ The following is the table property that Gravitino supports:
| `hive` | [Hive table property](./apache-hive-catalog.md#table-properties) | [Hive type mapping](./apache-hive-catalog.md#table-column-types) |
| `lakehouse-iceberg` | [Iceberg table property](./lakehouse-iceberg-catalog.md#table-properties) | [Iceberg type mapping](./lakehouse-iceberg-catalog.md#table-column-types) |
| `lakehouse-paimon` | [Paimon table property](./lakehouse-paimon-catalog.md#table-properties) | [Paimon type mapping](./lakehouse-paimon-catalog.md#table-column-types) |
+| `lakehouse-hudi` | [Hudi table property](./lakehouse-hudi-catalog.md#table-properties) | [Hudi type mapping](./lakehouse-hudi-catalog.md#table-column-types) |
| `jdbc-mysql` | [MySQL table property](./jdbc-mysql-catalog.md#table-properties) | [MySQL type mapping](./jdbc-mysql-catalog.md#table-column-types) |
| `jdbc-postgresql` | [PostgreSQL table property](./jdbc-postgresql-catalog.md#table-properties) | [PostgreSQL type mapping](./jdbc-postgresql-catalog.md#table-column-types) |
| `doris` | [Doris table property](./jdbc-doris-catalog.md#table-properties) | [Doris type mapping](./jdbc-doris-catalog.md#table-column-types) |
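
Note for readers of the new `docs/lakehouse-hudi-catalog.md`: the catalog properties table names `catalog-backend` and `uri` as required, and the provider table registers the catalog as `lakehouse-hudi`. The following is a minimal sketch, not part of this commit, of how those values might be assembled into a catalog-creation request body. The function name, its parameters, and the surrounding JSON field layout (`name`, `type`, `comment`, `provider`, `properties`) are assumptions for illustration; check the catalog operations doc for the authoritative request format.

```python
import json

# Provider name and backend value as they appear in the docs added by this commit.
HUDI_PROVIDER = "lakehouse-hudi"
SUPPORTED_BACKENDS = {"hms"}


def build_hudi_catalog_request(name, uri, backend="hms", pool_size=None, bypass=None):
    """Assemble a request body for a Hudi catalog (hypothetical helper).

    The JSON field layout is an assumption; only the property names and the
    provider string come from the documentation in this commit.
    """
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported catalog-backend: {backend!r}")
    if not uri:
        raise ValueError("uri is required")

    # The two required properties from the catalog properties table.
    properties = {"catalog-backend": backend, "uri": uri}

    # Optional HMS client pool size (documented default is 1).
    if pool_size is not None:
        properties["client.pool-size"] = str(pool_size)

    # Keys with the gravitino.bypass. prefix are passed to the backend client.
    for key, value in (bypass or {}).items():
        properties["gravitino.bypass." + key] = str(value)

    return {
        "name": name,
        "type": "RELATIONAL",
        "comment": "",
        "provider": HUDI_PROVIDER,
        "properties": properties,
    }


if __name__ == "__main__":
    body = build_hudi_catalog_request(
        "hudi_catalog",
        "thrift://127.0.0.1:9083",
        bypass={"hive.metastore.failure.retries": 3},
    )
    print(json.dumps(body, indent=2))
```

Property values are stringified because catalog properties are plain string key-value pairs; the helper rejects any backend other than `hms`, matching the current limitation stated in the docs.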