This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new a8dd5d79233 [Doc](s3-tvf) Update s3 tvf docs. (#582)
a8dd5d79233 is described below
commit a8dd5d79233eb6714cf907424fa005534e26f6f3
Author: Qi Chen <[email protected]>
AuthorDate: Mon May 6 23:15:01 2024 +0800
[Doc](s3-tvf) Update s3 tvf docs. (#582)
---
.../sql-manual/sql-functions/table-functions/s3.md | 56 +++++++++++++++++++---
.../sql-manual/sql-functions/table-functions/s3.md | 55 ++++++++++++++++++---
.../sql-manual/sql-functions/table-functions/s3.md | 55 ++++++++++++++++++---
.../sql-manual/sql-functions/table-functions/s3.md | 56 +++++++++++++++++++---
4 files changed, 194 insertions(+), 28 deletions(-)
diff --git a/docs/sql-manual/sql-functions/table-functions/s3.md b/docs/sql-manual/sql-functions/table-functions/s3.md
index e410bf39649..51a112a809f 100644
--- a/docs/sql-manual/sql-functions/table-functions/s3.md
+++ b/docs/sql-manual/sql-functions/table-functions/s3.md
@@ -59,11 +59,19 @@ Related parameters for accessing S3:
- `s3.region`: (optional). Mandatory if the Minio has set another region.
Otherwise, `us-east-1` is used by default.
- `s3.session_token`: (optional)
- `use_path_style`: (optional) default `false` . The S3 SDK uses the
virtual-hosted style by default. However, some object storage systems may not
be enabled or support virtual-hosted style access. At this time, we can add the
`use_path_style` parameter to force the use of path style access method.
+- `force_parsing_by_standard_uri`: (optional) default `false` . We can add `force_parsing_by_standard_uri` parameter to force parsing unstandard uri as standard uri.
-> Note: URI currently supports three SCHEMA: http://, https:// and s3://.
-> 1. If you use http:// or https://, you will decide whether to use the 'path style' to access s3 based on the 'use_path_style' parameter
-> 2. If you use s3://, you will use the "virtual-hosted style' to access the s3, 'use_path_style' parameter is invalid.
-> 3. If the uri path does not exist or the files are empty files, s3 tvf will return an empty result set.
+> Note:
+> For AWS S3, standard uri styles should be:
+> 1. AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Virtual Host Style: `https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 3. Path Style: `https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+>
+> In addition to supporting the common uri styles of the above three standards, it also supports some other uri styles (maybe not common, but there may be):
+> 1. Virtual Host AWS Client (Hadoop S3) Mixed Style:
+> `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Path AWS Client (Hadoop S3) Mixed Style:
+> `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
>
> For detailed use cases, you can refer to Best Practice at the bottom.
@@ -74,7 +82,7 @@ file format parameter:
- `line_delimiter`: (optional) default `\n`.
- `compress_type`: (optional) Currently support
`UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`. Default value is `UNKNOWN`, it
will automatically infer the type based on the suffix of `uri`.
- The following 6 parameters are used for loading in json format. For specific usage methods, please refer to: [Json Load](../../../data-operate/import/import-way/load-json-format.md)
+The following 6 parameters are used for loading in json format. For specific usage methods, please refer to: [Json Load](../../../data-operate/import/import-way/load-json-format.md)
- `read_json_by_line`: (optional) default `"true"`
- `strip_outer_array`: (optional) default `"false"`
@@ -83,7 +91,7 @@ file format parameter:
- `num_as_string`: (optional) default `"false"`
- `fuzzy_parse`: (optional) default `"false"`
- <version since="dev">The following 2 parameters are used for loading in csv format</version>
+The following 2 parameters are used for loading in csv format
- `trim_double_quotes`: Boolean type (optional), the default value is `false`.
True means that the outermost double quotes of each field in the csv file are
trimmed.
- `skip_lines`: Integer type (optional), the default value is 0. It will skip
some lines in the head of csv file. It will be disabled when the format is
`csv_with_names` or `csv_with_names_and_types`.
@@ -172,18 +180,52 @@ select * from s3(
"use_path_style" = "false");
```
+// MinIO
+select * from s3(
+ "uri" = "s3://bucket/file.csv",
+ "s3.endpoint" = "http://172.21.0.101:9000",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "s3.region" = "us-east-1",
+ "format" = "csv"
+);
+
Example of s3://:
```sql
// Note how to write your bucket of URI, no need to set 'use_path_style'.
// s3 will be accessed in 'virtual-hosted style'.
select * from s3(
- "URI" = "s3://bucket.endpoint/file/student.csv",
+ "URI" = "s3://bucket/file/student.csv",
+ "s3.endpoint"= "endpont",
+ "s3.region" = "region",
"s3.access_key"= "ak",
"s3.secret_key" = "sk",
"format" = "csv");
```
+Example of other uri styles:
+
+```sql
+// Virtual Host AWS Client (Hadoop S3) Mixed Style. Used by setting `use_path_style = false` and `force_parsing_by_standard_uri = true`.
+select * from s3(
+ "URI" = "s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="false",
+ "force_parsing_by_standard_uri"="true");
+
+// Path AWS Client (Hadoop S3) Mixed Style. Used by setting `use_path_style = true` and `force_parsing_by_standard_uri = true`.
+select * from s3(
+ "URI" = "s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true",
+ "force_parsing_by_standard_uri"="true");
+```
+
**csv format**
`csv` format: Read the file on S3 and process it as a csv file, read the first
line in the file to parse out the table schema. The number of columns in the
first line of the file `n` will be used as the number of columns in the table
schema, and the column names of the table schema will be automatically named
`c1, c2, ..., cn`, and the column type is set to `String` , for example:
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-functions/s3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-functions/s3.md
index 5ee99684a90..d7904e77006 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-functions/s3.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-functions/s3.md
@@ -62,11 +62,18 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `s3.region`: (选填)。如果Minio服务设置了其他的region,那么必填,否则默认使用`us-east-1`。
- `s3.session_token`: (选填)
- `use_path_style`:(选填) 默认为`false` 。S3 SDK 默认使用 virtual-hosted style
方式。但某些对象存储系统可能没开启或没支持virtual-hosted style 方式的访问,此时我们可以添加 use_path_style 参数来强制使用
path style 方式。比如 `minio`默认情况下只允许`path
style`访问方式,所以在访问minio时要加上`use_path_style=true`。
+- `force_parsing_by_standard_uri`:(选填)默认 `false` 。 我们可以添加 `force_parsing_by_standard_uri` 参数来强制将非标准的 uri 解析为标准 uri。
-> 注意:uri目前支持三种schema:http://, https:// 和 s3://
-> 1. 如果使用http://或https://, 则会根据 'use_path_style' 参数来决定是否使用'path style'方式访问s3
-> 2. 如果使用s3://, 则都使用 'virtual-hosted style' 方式访问s3, 'use_path_style'参数无效。
-> 3. 如果uri路径不存在或文件都是空文件,s3 tvf将返回空集合
+> 对于 AWS S3,标准 uri styles 有以下几种:
+> 1. AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`。
+> 2. Virtual Host Style:`https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`。
+> 3. Path Style:`https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`。
+>
+> 除了支持以上三个标准常见的 uri styles, 还支持其他一些 uri styles(也许不常见,但也有可能有):
+> 1. Virtual Host AWS Client (Hadoop S3) Mixed Style:
+> `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Path AWS Client (Hadoop S3) Mixed Style:
+> `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
>
> 详细使用案例可以参考最下方 Best Practice。
@@ -76,7 +83,7 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `line_delimiter`:(选填) 行分割符,默认为`\n`。
- `compress_type`: (选填) 目前支持 `UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`。 默认值为
`UNKNOWN`, 将会根据 `uri` 的后缀自动推断类型。
- 下面6个参数是用于json格式的导入,具体使用方法可以参照:[Json Load](../../../data-operate/import/import-way/load-json-format.md)
+下面6个参数是用于json格式的导入,具体使用方法可以参照:[Json Load](../../../data-operate/import/import-way/load-json-format.md)
- `read_json_by_line`: (选填) 默认为 `"true"`
- `strip_outer_array`: (选填) 默认为 `"false"`
@@ -85,7 +92,7 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `num_as_string`: (选填) 默认为 `false`
- `fuzzy_parse`: (选填) 默认为 `false`
- <version since="dev">下面2个参数是用于csv格式的导入</version>
+下面2个参数是用于csv格式的导入
- `trim_double_quotes`: 布尔类型,选填,默认值为 `false`,为 `true` 时表示裁剪掉 csv 文件每个字段最外层的双引号
- `skip_lines`: 整数类型,选填,默认值为0,含义为跳过csv文件的前几行。当设置format设置为 `csv_with_names` 或
`csv_with_names_and_types` 时,该参数会失效
@@ -161,6 +168,16 @@ select * from s3(
"format" = "parquet",
"use_path_style" = "false");
+// MinIO
+select * from s3(
+ "uri" = "s3://bucket/file.csv",
+ "s3.endpoint" = "http://172.21.0.101:9000",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "s3.region" = "us-east-1",
+ "format" = "csv"
+);
+
// 百度云bos采用兼容s3协议的virtual-hosted style方式访问s3。
// BOS
select * from s3(
@@ -178,12 +195,36 @@ s3:// 使用示例:
// 注意URI bucket写法, 无需设置use_path_style参数。
// 将采用virtual-hosted style方式访问s3。
select * from s3(
- "uri" = "s3://bucket.endpoint/file/student.csv",
+ "uri" = "s3://bucket/file/student.csv",
+ "s3.endpoint"= "endpont",
+ "s3.region"= "region",
"s3.access_key"= "ak",
"s3.secret_key" = "sk",
"format" = "csv");
```
+其它支持的 uri 风格示例:
+
+```sql
+// Virtual Host AWS Client (Hadoop S3) Mixed Style。通过设置 `use_path_style = false` 以及 `force_parsing_by_standard_uri = true` 来使用。
+select * from s3(
+ "URI" = "s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="false",
+ "force_parsing_by_standard_uri"="true");
+
+// Path AWS Client (Hadoop S3) Mixed Style。通过设置 `use_path_style = true` 以及 `force_parsing_by_standard_uri = true` 来使用。
+select * from s3(
+ "URI" = "s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true",
+ "force_parsing_by_standard_uri"="true");
+```
+
**csv format**
由于S3 table-valued-function事先并不知道table schema,所以会先读一遍文件来解析出table schema。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
index 5ee99684a90..d7904e77006 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
@@ -62,11 +62,18 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `s3.region`: (选填)。如果Minio服务设置了其他的region,那么必填,否则默认使用`us-east-1`。
- `s3.session_token`: (选填)
- `use_path_style`:(选填) 默认为`false` 。S3 SDK 默认使用 virtual-hosted style
方式。但某些对象存储系统可能没开启或没支持virtual-hosted style 方式的访问,此时我们可以添加 use_path_style 参数来强制使用
path style 方式。比如 `minio`默认情况下只允许`path
style`访问方式,所以在访问minio时要加上`use_path_style=true`。
+- `force_parsing_by_standard_uri`:(选填)默认 `false` 。 我们可以添加 `force_parsing_by_standard_uri` 参数来强制将非标准的 uri 解析为标准 uri。
-> 注意:uri目前支持三种schema:http://, https:// 和 s3://
-> 1. 如果使用http://或https://, 则会根据 'use_path_style' 参数来决定是否使用'path style'方式访问s3
-> 2. 如果使用s3://, 则都使用 'virtual-hosted style' 方式访问s3, 'use_path_style'参数无效。
-> 3. 如果uri路径不存在或文件都是空文件,s3 tvf将返回空集合
+> 对于 AWS S3,标准 uri styles 有以下几种:
+> 1. AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`。
+> 2. Virtual Host Style:`https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`。
+> 3. Path Style:`https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`。
+>
+> 除了支持以上三个标准常见的 uri styles, 还支持其他一些 uri styles(也许不常见,但也有可能有):
+> 1. Virtual Host AWS Client (Hadoop S3) Mixed Style:
+> `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Path AWS Client (Hadoop S3) Mixed Style:
+> `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
>
> 详细使用案例可以参考最下方 Best Practice。
@@ -76,7 +83,7 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `line_delimiter`:(选填) 行分割符,默认为`\n`。
- `compress_type`: (选填) 目前支持 `UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`。 默认值为
`UNKNOWN`, 将会根据 `uri` 的后缀自动推断类型。
- 下面6个参数是用于json格式的导入,具体使用方法可以参照:[Json Load](../../../data-operate/import/import-way/load-json-format.md)
+下面6个参数是用于json格式的导入,具体使用方法可以参照:[Json Load](../../../data-operate/import/import-way/load-json-format.md)
- `read_json_by_line`: (选填) 默认为 `"true"`
- `strip_outer_array`: (选填) 默认为 `"false"`
@@ -85,7 +92,7 @@ S3 tvf中的每一个参数都是一个 `"key"="value"` 对。
- `num_as_string`: (选填) 默认为 `false`
- `fuzzy_parse`: (选填) 默认为 `false`
- <version since="dev">下面2个参数是用于csv格式的导入</version>
+下面2个参数是用于csv格式的导入
- `trim_double_quotes`: 布尔类型,选填,默认值为 `false`,为 `true` 时表示裁剪掉 csv 文件每个字段最外层的双引号
- `skip_lines`: 整数类型,选填,默认值为0,含义为跳过csv文件的前几行。当设置format设置为 `csv_with_names` 或
`csv_with_names_and_types` 时,该参数会失效
@@ -161,6 +168,16 @@ select * from s3(
"format" = "parquet",
"use_path_style" = "false");
+// MinIO
+select * from s3(
+ "uri" = "s3://bucket/file.csv",
+ "s3.endpoint" = "http://172.21.0.101:9000",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "s3.region" = "us-east-1",
+ "format" = "csv"
+);
+
// 百度云bos采用兼容s3协议的virtual-hosted style方式访问s3。
// BOS
select * from s3(
@@ -178,12 +195,36 @@ s3:// 使用示例:
// 注意URI bucket写法, 无需设置use_path_style参数。
// 将采用virtual-hosted style方式访问s3。
select * from s3(
- "uri" = "s3://bucket.endpoint/file/student.csv",
+ "uri" = "s3://bucket/file/student.csv",
+ "s3.endpoint"= "endpont",
+ "s3.region"= "region",
"s3.access_key"= "ak",
"s3.secret_key" = "sk",
"format" = "csv");
```
+其它支持的 uri 风格示例:
+
+```sql
+// Virtual Host AWS Client (Hadoop S3) Mixed Style。通过设置 `use_path_style = false` 以及 `force_parsing_by_standard_uri = true` 来使用。
+select * from s3(
+ "URI" = "s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="false",
+ "force_parsing_by_standard_uri"="true");
+
+// Path AWS Client (Hadoop S3) Mixed Style。通过设置 `use_path_style = true` 以及 `force_parsing_by_standard_uri = true` 来使用。
+select * from s3(
+ "URI" = "s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true",
+ "force_parsing_by_standard_uri"="true");
+```
+
**csv format**
由于S3 table-valued-function事先并不知道table schema,所以会先读一遍文件来解析出table schema。
diff --git a/versioned_docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md b/versioned_docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
index e410bf39649..51a112a809f 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-functions/table-functions/s3.md
@@ -59,11 +59,19 @@ Related parameters for accessing S3:
- `s3.region`: (optional). Mandatory if the Minio has set another region.
Otherwise, `us-east-1` is used by default.
- `s3.session_token`: (optional)
- `use_path_style`: (optional) default `false` . The S3 SDK uses the
virtual-hosted style by default. However, some object storage systems may not
be enabled or support virtual-hosted style access. At this time, we can add the
`use_path_style` parameter to force the use of path style access method.
+- `force_parsing_by_standard_uri`: (optional) default `false` . We can add `force_parsing_by_standard_uri` parameter to force parsing unstandard uri as standard uri.
-> Note: URI currently supports three SCHEMA: http://, https:// and s3://.
-> 1. If you use http:// or https://, you will decide whether to use the 'path style' to access s3 based on the 'use_path_style' parameter
-> 2. If you use s3://, you will use the "virtual-hosted style' to access the s3, 'use_path_style' parameter is invalid.
-> 3. If the uri path does not exist or the files are empty files, s3 tvf will return an empty result set.
+> Note:
+> For AWS S3, standard uri styles should be:
+> 1. AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Virtual Host Style: `https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 3. Path Style: `https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+>
+> In addition to supporting the common uri styles of the above three standards, it also supports some other uri styles (maybe not common, but there may be):
+> 1. Virtual Host AWS Client (Hadoop S3) Mixed Style:
+> `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
+> 2. Path AWS Client (Hadoop S3) Mixed Style:
+> `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
>
> For detailed use cases, you can refer to Best Practice at the bottom.
@@ -74,7 +82,7 @@ file format parameter:
- `line_delimiter`: (optional) default `\n`.
- `compress_type`: (optional) Currently support
`UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`. Default value is `UNKNOWN`, it
will automatically infer the type based on the suffix of `uri`.
- The following 6 parameters are used for loading in json format. For specific usage methods, please refer to: [Json Load](../../../data-operate/import/import-way/load-json-format.md)
+The following 6 parameters are used for loading in json format. For specific usage methods, please refer to: [Json Load](../../../data-operate/import/import-way/load-json-format.md)
- `read_json_by_line`: (optional) default `"true"`
- `strip_outer_array`: (optional) default `"false"`
@@ -83,7 +91,7 @@ file format parameter:
- `num_as_string`: (optional) default `"false"`
- `fuzzy_parse`: (optional) default `"false"`
- <version since="dev">The following 2 parameters are used for loading in csv format</version>
+The following 2 parameters are used for loading in csv format
- `trim_double_quotes`: Boolean type (optional), the default value is `false`.
True means that the outermost double quotes of each field in the csv file are
trimmed.
- `skip_lines`: Integer type (optional), the default value is 0. It will skip
some lines in the head of csv file. It will be disabled when the format is
`csv_with_names` or `csv_with_names_and_types`.
@@ -172,18 +180,52 @@ select * from s3(
"use_path_style" = "false");
```
+// MinIO
+select * from s3(
+ "uri" = "s3://bucket/file.csv",
+ "s3.endpoint" = "http://172.21.0.101:9000",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "s3.region" = "us-east-1",
+ "format" = "csv"
+);
+
Example of s3://:
```sql
// Note how to write your bucket of URI, no need to set 'use_path_style'.
// s3 will be accessed in 'virtual-hosted style'.
select * from s3(
- "URI" = "s3://bucket.endpoint/file/student.csv",
+ "URI" = "s3://bucket/file/student.csv",
+ "s3.endpoint"= "endpont",
+ "s3.region" = "region",
"s3.access_key"= "ak",
"s3.secret_key" = "sk",
"format" = "csv");
```
+Example of other uri styles:
+
+```sql
+// Virtual Host AWS Client (Hadoop S3) Mixed Style. Used by setting `use_path_style = false` and `force_parsing_by_standard_uri = true`.
+select * from s3(
+ "URI" = "s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="false",
+ "force_parsing_by_standard_uri"="true");
+
+// Path AWS Client (Hadoop S3) Mixed Style. Used by setting `use_path_style = true` and `force_parsing_by_standard_uri = true`.
+select * from s3(
+ "URI" = "s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true",
+ "force_parsing_by_standard_uri"="true");
+```
+
**csv format**
`csv` format: Read the file on S3 and process it as a csv file, read the first
line in the file to parse out the table schema. The number of columns in the
first line of the file `n` will be used as the number of columns in the table
schema, and the column names of the table schema will be automatically named
`c1, c2, ..., cn`, and the column type is set to `String` , for example:
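
As a rough illustration of how the five URI shapes listed in the note differ, the Python sketch below labels a URI by its scheme and host. It is not the parser Doris uses. The function name `classify_s3_uri` and the host heuristic (an AWS endpoint host of the form `s3.<region>.amazonaws.com`) are assumptions made only for this example.

```python
from urllib.parse import urlsplit


def classify_s3_uri(uri: str) -> str:
    """Label a URI with one of the five style names from the note.

    Hypothetical helper: the host matching assumes AWS endpoints of the
    form s3.<region>.amazonaws.com and ignores other providers.
    """
    parts = urlsplit(uri)
    host = parts.netloc
    is_aws_host = host.endswith(".amazonaws.com")
    # Host is the bare endpoint, so the bucket must be in the path.
    host_is_endpoint = is_aws_host and host.startswith("s3.")
    # Host is "<bucket>.s3.<region>.amazonaws.com".
    host_has_bucket = is_aws_host and not host_is_endpoint and ".s3." in host

    if parts.scheme == "s3":
        if host_is_endpoint:
            return "Path AWS Client (Hadoop S3) Mixed Style"
        if host_has_bucket:
            return "Virtual Host AWS Client (Hadoop S3) Mixed Style"
        # Plain s3://bucket/key: the host part is the bucket name.
        return "AWS Client Style (Hadoop S3 Style)"
    if parts.scheme in ("http", "https"):
        if host_is_endpoint:
            return "Path Style"
        if host_has_bucket:
            return "Virtual Host Style"
    return "unknown"


if __name__ == "__main__":
    for example in (
        "s3://my-bucket/path/to/file",
        "https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt",
        "https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt",
        "s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt",
        "s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt",
    ):
        print(classify_s3_uri(example), "<-", example)
```

In the examples added by this commit, the two mixed styles are the ones that set `force_parsing_by_standard_uri = true`, with `use_path_style` chosen to match where the bucket name sits.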