This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-1.11 by this push:
new e3a8a6d [FLINK-18176][document] Add supplement for file system connector document
e3a8a6d is described below
commit e3a8a6d60555f43131379534019a0efa004cbcd4
Author: Shengkai <[email protected]>
AuthorDate: Tue Jun 9 20:42:41 2020 +0800
[FLINK-18176][document] Add supplement for file system connector document
This closes #12522
---
docs/dev/table/connectors/filesystem.md | 10 +++++++---
docs/dev/table/connectors/filesystem.zh.md | 10 +++++++---
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/docs/dev/table/connectors/filesystem.md b/docs/dev/table/connectors/filesystem.md
index 99d9d5c..bcef270 100644
--- a/docs/dev/table/connectors/filesystem.md
+++ b/docs/dev/table/connectors/filesystem.md
@@ -48,9 +48,9 @@ CREATE TABLE MyUserTable (
'path' = 'file:///path/to/whatever', -- required: path to a directory
'format' = '...', -- required: file system connector requires to specify a format,
-- Please refer to Table Formats
- -- section for more details.s
+ -- section for more details
'partition.default-name' = '...', -- optional: default partition name in case the dynamic partition
- -- column value is null/empty string.
+ -- column value is null/empty string
-- optional: the option to enable shuffle data by dynamic partition fields in sink phase, this can greatly
-- reduce the number of file for filesystem sink but may lead data skew, the default value is false.
@@ -65,6 +65,9 @@ CREATE TABLE MyUserTable (
<span class="label label-danger">Attention</span> File system sources for streaming is still under development. In the future, the community will add support for common streaming use cases, i.e., partition and directory monitoring.
+<span class="label label-danger">Attention</span> The behaviour of file system connector is much different from `previous legacy filesystem connector`:
+the path parameter is specified for a directory not for a file and you can't get a human-readable file in the path that you declare.
+
## Partition Files
Flink's file system partition support uses the standard hive format. However, it does not require partitions to be pre-registered with a table catalog. Partitions are discovered and inferred based on directory structure. For example, a table partitioned based on the directory below would be inferred to contain `datetime` and `hour` partitions.
@@ -137,7 +140,8 @@ a timeout that specifies the maximum duration for which a file can be open.
**NOTE:** For bulk formats (parquet, orc, avro), the rolling policy in combination with the checkpoint interval(pending files
become finished on the next checkpoint) control the size and number of these parts.
-**NOTE:** For row formats (csv, json), you can reduce the time interval appropriately to avoid too long delay.
+**NOTE:** For row formats (csv, json), you can set the parameter `sink.rolling-policy.file-size` or `sink.rolling-policy.time-interval` in the connector properties and parameter `execution.checkpointing.interval` in flink-conf.yaml together
+if you don't want to wait a long period before observe the data exists in file system. For other formats (avro, orc), you can just set parameter `execution.checkpointing.interval` in flink-conf.yaml.
### Partition Commit
diff --git a/docs/dev/table/connectors/filesystem.zh.md b/docs/dev/table/connectors/filesystem.zh.md
index 99d9d5c..bcef270 100644
--- a/docs/dev/table/connectors/filesystem.zh.md
+++ b/docs/dev/table/connectors/filesystem.zh.md
@@ -48,9 +48,9 @@ CREATE TABLE MyUserTable (
'path' = 'file:///path/to/whatever', -- required: path to a directory
'format' = '...', -- required: file system connector requires to specify a format,
-- Please refer to Table Formats
- -- section for more details.s
+ -- section for more details
'partition.default-name' = '...', -- optional: default partition name in case the dynamic partition
- -- column value is null/empty string.
+ -- column value is null/empty string
-- optional: the option to enable shuffle data by dynamic partition fields in sink phase, this can greatly
-- reduce the number of file for filesystem sink but may lead data skew, the default value is false.
@@ -65,6 +65,9 @@ CREATE TABLE MyUserTable (
<span class="label label-danger">Attention</span> File system sources for streaming is still under development. In the future, the community will add support for common streaming use cases, i.e., partition and directory monitoring.
+<span class="label label-danger">Attention</span> The behaviour of file system connector is much different from `previous legacy filesystem connector`:
+the path parameter is specified for a directory not for a file and you can't get a human-readable file in the path that you declare.
+
## Partition Files
Flink's file system partition support uses the standard hive format. However, it does not require partitions to be pre-registered with a table catalog. Partitions are discovered and inferred based on directory structure. For example, a table partitioned based on the directory below would be inferred to contain `datetime` and `hour` partitions.
@@ -137,7 +140,8 @@ a timeout that specifies the maximum duration for which a file can be open.
**NOTE:** For bulk formats (parquet, orc, avro), the rolling policy in combination with the checkpoint interval(pending files
become finished on the next checkpoint) control the size and number of these parts.
-**NOTE:** For row formats (csv, json), you can reduce the time interval appropriately to avoid too long delay.
+**NOTE:** For row formats (csv, json), you can set the parameter `sink.rolling-policy.file-size` or `sink.rolling-policy.time-interval` in the connector properties and parameter `execution.checkpointing.interval` in flink-conf.yaml together
+if you don't want to wait a long period before observe the data exists in file system. For other formats (avro, orc), you can just set parameter `execution.checkpointing.interval` in flink-conf.yaml.
### Partition Commit
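
For illustration only, the row-format NOTE added in this commit could correspond to a table declaration along the lines of the sketch below. The table name, columns, and option values are hypothetical. The rolling-policy option keys are copied from the NOTE as committed and may be named differently in other Flink versions.

```sql
-- Hypothetical sink table: the name, columns, and values are assumptions.
CREATE TABLE MyRowFormatSink (
  user_id BIGINT,
  message STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///path/to/whatever',           -- a directory, not a single file
  'format' = 'csv',                              -- a row format
  -- Option keys as named in the NOTE; smaller values roll part files sooner.
  'sink.rolling-policy.file-size' = '1MB',       -- assumed value
  'sink.rolling-policy.time-interval' = '1 min'  -- assumed value
);
```

The NOTE pairs these connector properties with `execution.checkpointing.interval` in flink-conf.yaml. That setting lives outside the DDL and is not shown here.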