This is an automated email from the ASF dual-hosted git repository.
wuchunfu pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new 0e61faf142 [Doc] hdfs file doc correct (#7216)
0e61faf142 is described below
commit 0e61faf1423085c3a5cb34b50392faa2e26730f6
Author: Jarvis <[email protected]>
AuthorDate: Wed Jul 17 17:39:25 2024 +0800
[Doc] hdfs file doc correct (#7216)
---
docs/en/connector-v2/source/HdfsFile.md | 2 +-
docs/zh/connector-v2/source/HdfsFile.md | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/docs/en/connector-v2/source/HdfsFile.md b/docs/en/connector-v2/source/HdfsFile.md
index c37f3fb121..20a2559ddb 100644
--- a/docs/en/connector-v2/source/HdfsFile.md
+++ b/docs/en/connector-v2/source/HdfsFile.md
@@ -46,7 +46,7 @@ Read data from hdfs file system.
| path | string | yes | - | The source file path. |
| file_format_type | string | yes | - | We support the following file types: `text` `csv` `parquet` `orc` `json` `excel` `xml` `binary`. Please note that the final file name will end with the file format's suffix; the suffix of a text file is `txt`. |
| fs.defaultFS | string | yes | - | The hadoop cluster address that starts with `hdfs://`, for example: `hdfs://hadoopcluster`. |
-| read_columns | list | yes | - | The read column list of the data source; the user can use it to implement field projection. The following file types support column projection: [text,json,csv,orc,parquet,excel,xml]. Tip: if you want to use this feature when reading `text`, `json`, or `csv` files, the schema option must be configured. |
+| read_columns | list | no | - | The read column list of the data source; the user can use it to implement field projection. The following file types support column projection: [text,json,csv,orc,parquet,excel,xml]. Tip: if you want to use this feature when reading `text`, `json`, or `csv` files, the schema option must be configured. |
| hdfs_site_path | string | no | - | The path of `hdfs-site.xml`, used to load the HA configuration of NameNodes. |
| delimiter/field_delimiter | string | no | \001 | Field delimiter, used to tell the connector how to slice and dice fields when reading text files. Defaults to `\001`, the same as Hive's default delimiter. |
| parse_partition_from_path | boolean | no | true | Control whether to parse the partition keys and values from the file path. For example, if you read a file from the path `hdfs://hadoop-cluster/tmp/seatunnel/parquet/name=tyrantlucifer/age=26`, every record read from the file will have these two fields added: [name:tyrantlucifer, age:26]. Tip: do not define partition fields in the schema option. |
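For context, a minimal HdfsFile source sketch showing the now-optional `read_columns` together with the `schema` option it requires for `text`/`json`/`csv` files; the path, cluster address, and field names below are hypothetical examples, not taken from the patch.

```hocon
# A minimal sketch of an HdfsFile source; path, address, and fields are hypothetical.
source {
  HdfsFile {
    path = "/apps/demo/student"
    file_format_type = "json"
    fs.defaultFS = "hdfs://hadoopcluster"
    # Optional field projection; for `text`/`json`/`csv` files it only
    # takes effect when the schema option below is also configured.
    read_columns = ["name", "age"]
    schema {
      fields {
        name = string
        age = int
      }
    }
  }
}
```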
diff --git a/docs/zh/connector-v2/source/HdfsFile.md b/docs/zh/connector-v2/source/HdfsFile.md
index efb24571c8..efce1d1401 100644
--- a/docs/zh/connector-v2/source/HdfsFile.md
+++ b/docs/zh/connector-v2/source/HdfsFile.md
@@ -44,7 +44,7 @@
| path | string | yes | - | The source file path. |
| file_format_type | string | yes | - | We support the following file types: `text` `json` `csv` `orc` `parquet` `excel`. Please note that the final file name will end with the file format's suffix; the suffix of a text file is `txt`. |
| fs.defaultFS | string | yes | - | The Hadoop cluster address that starts with `hdfs://`, for example: `hdfs://hadoopcluster`. |
-| read_columns | list | yes | - | The read column list of the data source; the user can use it to implement field projection. The following file types support column projection: [text,json,csv,orc,parquet,excel]. Tip: if you want to use this feature when reading `text`, `json`, or `csv` files, the schema option must be configured. |
+| read_columns | list | no | - | The read column list of the data source; the user can use it to implement field projection. The following file types support column projection: [text,json,csv,orc,parquet,excel]. Tip: if you want to use this feature when reading `text`, `json`, or `csv` files, the schema option must be configured. |
| hdfs_site_path | string | no | - | The path of `hdfs-site.xml`, used to load the HA configuration of NameNodes. |
| delimiter/field_delimiter | string | no | \001 | Field delimiter, used to tell the connector how to split fields when reading text files. Defaults to `\001`, the same as Hive's default delimiter. |
| parse_partition_from_path | boolean | no | true | Control whether to parse the partition keys and values from the file path. For example, if you read a file from the path `hdfs://hadoop-cluster/tmp/seatunnel/parquet/name=tyrantlucifer/age=26`, every record read from the file will have these two fields added: [name:tyrantlucifer, age:26]. Tip: do not define partition fields in the schema option. |