Can you elaborate on how you are writing with the "S3 sink", and which
version of Flink you are using?
If you are using BucketingSink, you can check out the API docs and
configure it to flush before closing your sink.
This way your sink is "integrated with the checkpointing mechanism to
provide exactly-once semantics".
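As a minimal sketch of what that configuration looks like: the example
below wires a BucketingSink to an S3A path with checkpointing enabled, so
part files are only finalized as checkpoints complete. The bucket path,
the source stand-in, and the exact option values are assumptions for
illustration; check the BucketingSink Javadoc for your Flink version.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class ClickStreamToS3 {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // BucketingSink relies on checkpoints to move in-progress/pending
        // files to their final state, so checkpointing must be enabled.
        env.enableCheckpointing(60_000); // every 60 s (assumed interval)

        // Stand-in for your Kafka clickstream source.
        DataStream<String> clicks = env.fromElements("click1", "click2");

        // Hypothetical bucket path -- replace with your own.
        BucketingSink<String> sink =
                new BucketingSink<>("s3a://my-bucket/clicks");
        sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HH"));
        sink.setBatchSize(128L * 1024 * 1024); // roll part files at ~128 MB
        // Close buckets that see no writes for a while, so their data
        // becomes pending and is committed at the next checkpoint.
        sink.setInactiveBucketCheckInterval(60_000);
        sink.setInactiveBucketThreshold(60_000);

        clicks.addSink(sink);
        env.execute("clickstream-to-s3");
    }
}
```

On failure, Flink restores from the last completed checkpoint and
truncates or discards part files written after it, which is where the
exactly-once guarantee for the file output comes from.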
On Thu, May 17, 2018 at 2:57 AM, Amit Jain <aj201...@gmail.com> wrote:
> We are using Flink to process clickstream data from Kafka and push it
> into 128 MB files in S3.
> What are the message-processing guarantees with the S3 sink? In my
> understanding, the S3A client buffers the data in memory/on disk. In a
> failure scenario on a particular node, the TM would not trigger
> Writer#close, hence the buffered data could be lost entirely, assuming
> this buffer contains data from the last successful checkpoint.