[jira] [Commented] (FLINK-20918) Avoid excessive flush of Hadoop output stream

Yun Gao (Jira) Wed, 20 Jan 2021 02:45:06 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268505#comment-17268505
 ]


Yun Gao commented on FLINK-20918:
---------------------------------

Hi [~Paul Lin], thanks a lot for opening the issue! One concern I have: can we 
be sure that `hsync` is an enhanced version of `hflush` in all 
implementations? I ask because other file systems or object stores may 
provide Hadoop-compatible FileSystem implementations, so could this change 
cause different behavior for some users?
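
To make the concern concrete, here is a minimal sketch. It does not use Hadoop 
itself: `FakeStream` and its two flags are hypothetical stand-ins for a stream 
exposing `hflush`/`hsync`, and the "loose" variant is an assumed 
implementation, not a known one. It only illustrates that dropping `hflush` 
is safe exactly when `hsync` also does what `hflush` does.

```java
public class SyncSketch {

    /** Hypothetical stand-in for a stream exposing hflush/hsync (not Hadoop code). */
    static final class FakeStream {
        private final boolean hsyncImpliesHflush;
        boolean visibleToReaders; // effect attributed to hflush in this model
        boolean persisted;        // extra effect attributed to hsync in this model

        FakeStream(boolean hsyncImpliesHflush) {
            this.hsyncImpliesHflush = hsyncImpliesHflush;
        }

        void hflush() {
            visibleToReaders = true;
        }

        void hsync() {
            persisted = true;
            // Assumption under test: hsync also performs the hflush work.
            if (hsyncImpliesHflush) {
                visibleToReaders = true;
            }
        }
    }

    /** Compares "hflush + hsync" (current) with "hsync only" (proposed). */
    static boolean sameOutcome(boolean hsyncImpliesHflush) {
        FakeStream current = new FakeStream(hsyncImpliesHflush);
        current.hflush();
        current.hsync();

        FakeStream proposed = new FakeStream(hsyncImpliesHflush);
        proposed.hsync();

        return current.visibleToReaders == proposed.visibleToReaders
                && current.persisted == proposed.persisted;
    }

    public static void main(String[] args) {
        System.out.println(sameOutcome(true));  // hsync covers hflush
        System.out.println(sameOutcome(false)); // hypothetical loose implementation
    }
}
```

In this model the two call sequences agree only when `hsync` covers `hflush`, 
which is the property we would need every Hadoop-compatible implementation to 
guarantee.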

> Avoid excessive flush of Hadoop output stream
> ---------------------------------------------
>
>                 Key: FLINK-20918
>                 URL: https://issues.apache.org/jira/browse/FLINK-20918
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hadoop Compatibility, FileSystems
>    Affects Versions: 1.12.0, 1.11.3
>            Reporter: Paul Lin
>            Priority: Major
>              Labels: pull-request-available
>
> [HadoopRecoverableFsDataOutputStream#sync|https://github.com/apache/flink/blob/67d167ccd45046fc5ed222ac1f1e3ba5e6ec434b/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L123]
>  calls both `hflush` and `hsync`, even though `hsync` is an enhanced version 
> of `hflush`. We should remove the `hflush` call to avoid the excessive flush.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
