Hello,

I’m trying to understand the guarantees made by Flink’s Elasticsearch sink in 
terms of message delivery. according to (1), the ES sink offers at-least-once 
guarantees. This page doesn’t differentiate between flink-elasticsearch and 
flink-elasticsearch2, so I have to assume for the moment that they both offer 
that guarantee. However, a look at the code (2) shows that the invoke() method 
puts the record into a buffer, and then that buffer is flushed to elasticsearch 
some time later.

It’s my understanding that Flink uses checkpoint “records” flowing past the 
sink as a means for forming the guarantee that all records prior to the 
checkpoint have been received by the sink.  I assume that the invoke() method 
returning is what Flink uses to decide if a record has passed a sink, but here 
invoke stashes in a buffer that doesn’t look like it participates in 
checkpointing anywhere.

Does the sink provided in link-connector-elasticsearch2 guarantee 
at-least-once, and if it does, how does it reconstitute the buffer (so as to 
not lose records that have gone through the sink’s invoke() method, but not 
been transmitted to ES yet) in the case of the operator failing when the buffer 
is not empty?


(1) 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html
 
<https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html>
(2) 
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch2/src/main/java/org/apache/flink/streaming/connectors/elasticsearch2/ElasticsearchSink.java
 
<https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch2/src/main/java/org/apache/flink/streaming/connectors/elasticsearch2/ElasticsearchSink.java>

Reply via email to