Can you elaborate on how you are writing with the "S3 sink", and which
version of Flink you are using?
If you are using BucketingSink, you can check out the API docs and
configure it to flush before closing your sink.
This way your sink is "integrated with the checkpointing mechanism to
provide exactly-once semantics".
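As a minimal sketch of what that configuration looks like: the example
below wires a BucketingSink to an S3A path with checkpointing enabled, so
part files are only finalized as checkpoints complete. The bucket path,
the source stand-in, and the exact option values are assumptions for
illustration; check the BucketingSink Javadoc for your Flink version.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class ClickStreamToS3 {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // BucketingSink relies on checkpoints to move in-progress/pending
        // files to their final state, so checkpointing must be enabled.
        env.enableCheckpointing(60_000); // every 60 s (assumed interval)

        // Stand-in for your Kafka clickstream source.
        DataStream<String> clicks = env.fromElements("click1", "click2");

        // Hypothetical bucket path -- replace with your own.
        BucketingSink<String> sink =
                new BucketingSink<>("s3a://my-bucket/clicks");
        sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HH"));
        sink.setBatchSize(128L * 1024 * 1024); // roll part files at ~128 MB
        // Close buckets that see no writes for a while, so their data
        // becomes pending and is committed at the next checkpoint.
        sink.setInactiveBucketCheckInterval(60_000);
        sink.setInactiveBucketThreshold(60_000);

        clicks.addSink(sink);
        env.execute("clickstream-to-s3");
    }
}
```

On failure, Flink restores from the last completed checkpoint and
truncates or discards part files written after it, which is where the
exactly-once guarantee for the file output comes from.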
On Thu, May 17, 2018 at 2:57 AM, Amit Jain <aj201...@gmail.com> wrote:
> We are using Flink to process clickstream data from Kafka and push it
> into 128 MB files in S3.
> What are the message-processing guarantees with the S3 sink? In my
> understanding, the S3A client buffers the data in memory/on disk. In a
> failure scenario on a particular node, the TM would not trigger
> Writer#close, hence the buffered data could be lost entirely, assuming
> this buffer contains data from the last successful checkpoint.