[ https://issues.apache.org/jira/browse/FLINK-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428287#comment-16428287 ]
ASF GitHub Bot commented on FLINK-9113:
---------------------------------------

Github user twalthr commented on the issue:

    https://github.com/apache/flink/pull/5811

    Thanks for looking into it @kl0u. I observed the same behavior during debugging. I will remove the check for now and open a follow-up issue. If there is no better solution, we might need to close the writer for checkpoints on local filesystems to prevent data loss in cases where the OS/machine goes down.

> Data loss in BucketingSink when writing to local filesystem
> -----------------------------------------------------------
>
>                 Key: FLINK-9113
>                 URL: https://issues.apache.org/jira/browse/FLINK-9113
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming Connectors
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>
> This issue is closely related to FLINK-7737. By default the bucketing sink
> uses HDFS's {{org.apache.hadoop.fs.FSDataOutputStream#hflush}} for
> performance reasons. However, this leads to data loss in case of TaskManager
> failures when writing to a local filesystem
> {{org.apache.hadoop.fs.LocalFileSystem}}. We should use {{hsync}} by default
> in local filesystem cases and make it possible to disable this behavior if
> needed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
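
For readers unfamiliar with the flush/sync distinction behind this issue, here is a minimal sketch using only the JDK. It is not Flink's BucketingSink code and does not use the Hadoop API: `FileDescriptor.sync()` stands in for the role {{hsync}} plays, and a plain write/flush stands in for {{hflush}}. The class `DurableWriteSketch`, the `FlushMode` enum, and the `defaultModeFor` rule are hypothetical names chosen for illustration, and treating the scheme `file` as "local filesystem" is an assumption.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DurableWriteSketch {

    /** Hypothetical switch standing in for a configurable sink option. */
    enum FlushMode { FLUSH_ONLY, SYNC }

    /** Appends one record and applies the requested durability level. */
    static void writeRecord(Path file, String record, FlushMode mode) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile(), true)) {
            out.write(record.getBytes(StandardCharsets.UTF_8));
            // After write()/flush() the bytes are with the OS, but they may
            // still sit in its page cache and be lost if the machine goes down.
            out.flush();
            if (mode == FlushMode.SYNC) {
                // Ask the OS to force the data to the storage device.
                out.getFD().sync();
            }
        }
    }

    /** Assumed default: sync on local files, flush-only elsewhere. */
    static FlushMode defaultModeFor(String fileSystemScheme) {
        return "file".equals(fileSystemScheme) ? FlushMode.SYNC : FlushMode.FLUSH_ONLY;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bucket-part", ".txt");
        writeRecord(tmp, "a\n", defaultModeFor("file"));
        writeRecord(tmp, "b\n", defaultModeFor("hdfs"));
        System.out.println(Files.readAllLines(tmp));
        Files.delete(tmp);
    }
}
```

Syncing on every record is slower than flushing, which is why the sketch keeps the mode selectable, mirroring the issue's request to make the behavior possible to disable.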