This is an automated email from the ASF dual-hosted git repository.
wanghailin pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new ce39948ca5 [Docs][Connector-V2][Hudi] Reconstruct the Hudi connector document (#4905)
ce39948ca5 is described below
commit ce39948ca5d1bfae35e4e4dc503d56dc1345cf18
Author: Carl-Zhou-CN <[email protected]>
AuthorDate: Fri Jul 28 15:11:03 2023 +0800
[Docs][Connector-V2][Hudi] Reconstruct the Hudi connector document (#4905)
* [Docs][Connector-V2][Hudi] Reconstruct the Hudi connector document
---------
Co-authored-by: zhouyao <[email protected]>
---
docs/en/connector-v2/source/Hudi.md | 82 ++++++++++++++++++++-----------------
1 file changed, 44 insertions(+), 38 deletions(-)
diff --git a/docs/en/connector-v2/source/Hudi.md b/docs/en/connector-v2/source/Hudi.md
index cb3b154d58..b70d34608e 100644
--- a/docs/en/connector-v2/source/Hudi.md
+++ b/docs/en/connector-v2/source/Hudi.md
@@ -2,69 +2,67 @@
> Hudi source connector
-## Description
+## Support Those Engines
-Used to read data from Hudi. Currently, only supports hudi cow table and Snapshot Query with Batch Mode.
+> Spark<br/>
+> Flink<br/>
+> SeaTunnel Zeta<br/>
-In order to use this connector, You must ensure your spark/flink cluster already integrated hive. The tested hive version is 2.3.9.
-
-## Key features
+## Key Features
- [x] [batch](../../concept/connector-v2-features.md)
-
-Currently, only supports hudi cow table and Snapshot Query with Batch Mode
-
- [ ] [stream](../../concept/connector-v2-features.md)
- [x] [exactly-once](../../concept/connector-v2-features.md)
- [ ] [column projection](../../concept/connector-v2-features.md)
- [x] [parallelism](../../concept/connector-v2-features.md)
- [ ] [support user-defined split](../../concept/connector-v2-features.md)
-## Options
-
-| name                    | type    | required                     | default value |
-|-------------------------|---------|------------------------------|---------------|
-| table.path              | string  | yes                          | -             |
-| table.type              | string  | yes                          | -             |
-| conf.files              | string  | yes                          | -             |
-| use.kerberos            | boolean | no                           | false         |
-| kerberos.principal      | string  | yes when use.kerberos = true | -             |
-| kerberos.principal.file | string  | yes when use.kerberos = true | -             |
-| common-options          | config  | no                           | -             |
-
-### table.path [string]
-
-`table.path` The hdfs root path of hudi table,such as 'hdfs://nameserivce/data/hudi/hudi_table/'.
+## Description
-### table.type [string]
+Used to read data from Hudi. Currently, only Hudi COW tables and Snapshot Query with Batch Mode are supported.
-`table.type` The type of hudi table. Now we only support 'cow', 'mor' is not support yet.
+To use this connector, you must ensure your Spark/Flink cluster is already integrated with Hive. The tested Hive version is 2.3.9.
-### conf.files [string]
+## Supported DataSource Info
-`conf.files` The environment conf file path list(local path), which used to init hdfs client to read hudi table file. The example is '/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml'.
+:::tip
-### use.kerberos [boolean]
+* Currently, only Hudi COW tables and Snapshot Query with Batch Mode are supported
-`use.kerberos` Whether to enable Kerberos, default is false.
+:::
-### kerberos.principal [string]
+## Data Type Mapping
-`kerberos.principal` When use kerberos, we should set kerberos princal such as 'test_user@xxx'.
+| Hudi Data Type | SeaTunnel Data Type |
+|----------------|---------------------|
+| ALL TYPE       | STRING              |
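Because every Hudi type is surfaced to SeaTunnel as STRING, downstream steps usually cast columns back to concrete types. A minimal sketch using the SQL transform; the field names `id` and `price` and the table name are hypothetical placeholders, not part of this document:

```hocon
transform {
  Sql {
    # Cast the STRING columns emitted by the Hudi source back to typed columns.
    # Field and table names here are placeholders for illustration only.
    query = "select cast(id as bigint) as id, cast(price as double) as price from hudi_source"
  }
}
```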
-### kerberos.principal.file [string]
+## Source Options
-`kerberos.principal.file` When use kerberos, we should set kerberos princal file such as '/home/test/test_user.keytab'.
+| Name                    | Type    | Required                     | Default | Description |
+|-------------------------|---------|------------------------------|---------|-------------|
+| table.path              | String  | Yes                          | -       | The HDFS root path of the Hudi table, such as 'hdfs://nameserivce/data/hudi/hudi_table/'. |
+| table.type              | String  | Yes                          | -       | The type of the Hudi table. Currently only 'cow' is supported; 'mor' is not supported yet. |
+| conf.files              | String  | Yes                          | -       | The environment conf file path list (local paths), used to init the HDFS client that reads the Hudi table files. Example: '/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml'. |
+| use.kerberos            | Boolean | No                           | false   | Whether to enable Kerberos; defaults to false. |
+| kerberos.principal      | String  | Yes when use.kerberos = true | -       | When Kerberos is enabled, set the Kerberos principal, such as 'test_user@xxx'. |
+| kerberos.principal.file | String  | Yes when use.kerberos = true | -       | When Kerberos is enabled, set the Kerberos principal keytab file, such as '/home/test/test_user.keytab'. |
+| common-options          | config  | No                           | -       | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
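As a quick orientation, a source block using only the three required options might look like the following sketch (Kerberos stays at its default of disabled; the paths are the same placeholder values used elsewhere in this document):

```hocon
source {
  Hudi {
    # Only the required options; use.kerberos defaults to false.
    table.path = "hdfs://nameserivce/data/hudi/hudi_table/"
    table.type = "cow"
    conf.files = "/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml"
  }
}
```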
-### common options
+## Task Example
-Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details.
+### Simple:
-## Examples
+> This example reads from a Hudi COW table and configures Kerberos for the environment, printing to the console.
```hocon
-source {
-
+# Defining the runtime environment
+env {
+  # You can set Flink configuration here
+ execution.parallelism = 2
+ job.mode = "BATCH"
+}
+source {
Hudi {
table.path = "hdfs://nameserivce/data/hudi/hudi_table/"
table.type = "cow"
@@ -73,7 +71,15 @@ source {
kerberos.principal = "test_user@xxx"
kerberos.principal.file = "/home/test/test_user.keytab"
}
+}
+
+transform {
+  # If you would like to get more information about how to configure SeaTunnel and see the full list of transform plugins,
+ # please go to https://seatunnel.apache.org/docs/transform-v2/sql/
+}
+sink {
+ Console {}
}
```