[flink] branch release-1.11 updated: [FLINK-18176][document] Add supplement for file system connector document

lzljs3620320 Tue, 09 Jun 2020 05:44:10 -0700

This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git



The following commit(s) were added to refs/heads/release-1.11 by this push:
     new e3a8a6d  [FLINK-18176][document] Add supplement for file system connector document
e3a8a6d is described below

commit e3a8a6d60555f43131379534019a0efa004cbcd4
Author: Shengkai <[email protected]>
AuthorDate: Tue Jun 9 20:42:41 2020 +0800

    [FLINK-18176][document] Add supplement for file system connector document
    
    
    This closes #12522
---
 docs/dev/table/connectors/filesystem.md    | 10 +++++++---
 docs/dev/table/connectors/filesystem.zh.md | 10 +++++++---
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/dev/table/connectors/filesystem.md b/docs/dev/table/connectors/filesystem.md
index 99d9d5c..bcef270 100644
--- a/docs/dev/table/connectors/filesystem.md
+++ b/docs/dev/table/connectors/filesystem.md
@@ -48,9 +48,9 @@ CREATE TABLE MyUserTable (
   'path' = 'file:///path/to/whatever',  -- required: path to a directory
  'format' = '...',                     -- required: file system connector requires to specify a format,
                                         -- Please refer to Table Formats
-                                        -- section for more details.s
+                                        -- section for more details
  'partition.default-name' = '...',     -- optional: default partition name in case the dynamic partition
-                                        -- column value is null/empty string.
+                                        -- column value is null/empty string
   
  -- optional: the option to enable shuffle data by dynamic partition fields in sink phase, this can greatly
  -- reduce the number of file for filesystem sink but may lead data skew, the default value is false.
@@ -65,6 +65,9 @@ CREATE TABLE MyUserTable (
 
 <span class="label label-danger">Attention</span> File system sources for streaming is still under development. In the future, the community will add support for common streaming use cases, i.e., partition and directory monitoring.
 
+<span class="label label-danger">Attention</span> The behaviour of file system connector is much different from `previous legacy filesystem connector`:
+the path parameter is specified for a directory not for a file and you can't get a human-readable file in the path that you declare.
+
 ## Partition Files
 
 Flink's file system partition support uses the standard hive format. However, it does not require partitions to be pre-registered with a table catalog. Partitions are discovered and inferred based on directory structure. For example, a table partitioned based on the directory below would be inferred to contain `datetime` and `hour` partitions.
@@ -137,7 +140,8 @@ a timeout that specifies the maximum duration for which a file can be open.
 **NOTE:** For bulk formats (parquet, orc, avro), the rolling policy in combination with the checkpoint interval(pending files
 become finished on the next checkpoint) control the size and number of these parts.
 
-**NOTE:** For row formats (csv, json), you can reduce the time interval appropriately to avoid too long delay.
+**NOTE:** For row formats (csv, json), you can set the parameter `sink.rolling-policy.file-size` or `sink.rolling-policy.time-interval` in the connector properties and parameter `execution.checkpointing.interval` in flink-conf.yaml together
+if you don't want to wait a long period before observe the data exists in file system. For other formats (avro, orc), you can just set parameter `execution.checkpointing.interval` in flink-conf.yaml.
 
 ### Partition Commit
 
diff --git a/docs/dev/table/connectors/filesystem.zh.md b/docs/dev/table/connectors/filesystem.zh.md
index 99d9d5c..bcef270 100644
--- a/docs/dev/table/connectors/filesystem.zh.md
+++ b/docs/dev/table/connectors/filesystem.zh.md
@@ -48,9 +48,9 @@ CREATE TABLE MyUserTable (
   'path' = 'file:///path/to/whatever',  -- required: path to a directory
  'format' = '...',                     -- required: file system connector requires to specify a format,
                                         -- Please refer to Table Formats
-                                        -- section for more details.s
+                                        -- section for more details
  'partition.default-name' = '...',     -- optional: default partition name in case the dynamic partition
-                                        -- column value is null/empty string.
+                                        -- column value is null/empty string
   
  -- optional: the option to enable shuffle data by dynamic partition fields in sink phase, this can greatly
  -- reduce the number of file for filesystem sink but may lead data skew, the default value is false.
@@ -65,6 +65,9 @@ CREATE TABLE MyUserTable (
 
 <span class="label label-danger">Attention</span> File system sources for streaming is still under development. In the future, the community will add support for common streaming use cases, i.e., partition and directory monitoring.
 
+<span class="label label-danger">Attention</span> The behaviour of file system connector is much different from `previous legacy filesystem connector`:
+the path parameter is specified for a directory not for a file and you can't get a human-readable file in the path that you declare.
+
 ## Partition Files
 
 Flink's file system partition support uses the standard hive format. However, it does not require partitions to be pre-registered with a table catalog. Partitions are discovered and inferred based on directory structure. For example, a table partitioned based on the directory below would be inferred to contain `datetime` and `hour` partitions.
@@ -137,7 +140,8 @@ a timeout that specifies the maximum duration for which a file can be open.
 **NOTE:** For bulk formats (parquet, orc, avro), the rolling policy in combination with the checkpoint interval(pending files
 become finished on the next checkpoint) control the size and number of these parts.
 
-**NOTE:** For row formats (csv, json), you can reduce the time interval appropriately to avoid too long delay.
+**NOTE:** For row formats (csv, json), you can set the parameter `sink.rolling-policy.file-size` or `sink.rolling-policy.time-interval` in the connector properties and parameter `execution.checkpointing.interval` in flink-conf.yaml together
+if you don't want to wait a long period before observe the data exists in file system. For other formats (avro, orc), you can just set parameter `execution.checkpointing.interval` in flink-conf.yaml.
 
 ### Partition Commit
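
For illustration only (not part of the commit): the row-format NOTE added above might translate into a sink table roughly like the sketch below. The option names `sink.rolling-policy.file-size` and `sink.rolling-policy.time-interval` are taken from the diff as written; the column names, the `csv` format choice, and the concrete values are assumptions picked for the example.

```sql
-- Hypothetical sink table; columns and option values are illustrative assumptions.
CREATE TABLE MyUserTable (
  user_id STRING,
  amount  DOUBLE,
  dt      STRING
) PARTITIONED BY (dt) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///path/to/whatever',          -- a directory, not a single file
  'format' = 'csv',                             -- a row format
  'sink.rolling-policy.file-size' = '1MB',      -- roll the in-progress part file at this size (assumed value)
  'sink.rolling-policy.time-interval' = '1 min' -- or after it has been open this long (assumed value)
);
-- Per the NOTE, execution.checkpointing.interval must also be set in flink-conf.yaml,
-- since pending files only become finished on the next checkpoint.
```

Smaller values make data visible in the file system sooner, at the cost of producing more and smaller files.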
