[
https://issues.apache.org/jira/browse/FLINK-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880120#comment-15880120
]
ASF GitHub Bot commented on FLINK-5487:
---------------------------------------
Github user tzulitai commented on a diff in the pull request:
https://github.com/apache/flink/pull/3358#discussion_r102662136
--- Diff: flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/ElasticsearchSinkBase.java ---
@@ -211,6 +283,23 @@ public void invoke(T value) throws Exception {
     	}
     	@Override
    +	public void initializeState(FunctionInitializationContext context) throws Exception {
    +		// no initialization needed
    +	}
    +
    +	@Override
    +	public void snapshotState(FunctionSnapshotContext context) throws Exception {
    +		checkErrorAndRethrow();
    +
    +		if (flushOnCheckpoint) {
    +			do {
    +				bulkProcessor.flush();
--- End diff --
Following my arguments above, I think the busy loop you mentioned shouldn't
happen, because the bulk processor's internal `bulkRequest.numberOfActions()`
should always be in sync with our `numPendingRecords` (i.e., it should never
be the case that `bulkRequest.numberOfActions() == 0` while our own
`numPendingRecords != 0`).
So if `bulkRequest.numberOfActions() == 0`, my original loop implementation
simply falls back to a single pass with two condition checks.
To a certain extent, I think it might be better to stick to the original
loop implementation, so that we're not locked in to how the `BulkProcessor`'s
flush is implemented. As you can see from a commit I just pushed (2956f99),
which modifies the mock bulk processor in the tests to correctly mimic the
flushing behaviour I described above, the loop implementation still passes
the tests.
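To make the argument concrete, here is a minimal, self-contained sketch of the loop behaviour described above. It is not the connector code: `MockBulkProcessor`, `flushUntilDrained` and `flushCalls` are hypothetical names, and the loop condition is an assumption because the diff excerpt is truncated. It only shows that, as long as the pending counter stays in sync with the buffered actions, the loop degenerates to a single pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; these are hypothetical stand-ins, not Flink or Elasticsearch classes.
class FlushLoopSketch {

    /** Hypothetical mock: flush() completes every buffered action and updates the shared counter. */
    static class MockBulkProcessor {
        private final List<String> buffered = new ArrayList<>();
        private final AtomicLong numPendingRecords;
        int flushCalls = 0;

        MockBulkProcessor(AtomicLong numPendingRecords) {
            this.numPendingRecords = numPendingRecords;
        }

        void add(String action) {
            buffered.add(action);
            numPendingRecords.incrementAndGet();
        }

        int numberOfActions() {
            return buffered.size();
        }

        void flush() {
            flushCalls++;
            // Completing the buffered actions keeps the counter in sync with the buffer.
            numPendingRecords.addAndGet(-buffered.size());
            buffered.clear();
        }
    }

    /** Assumed loop shape: flush at least once, repeat only while records are still pending. */
    static void flushUntilDrained(MockBulkProcessor processor, AtomicLong numPendingRecords) {
        do {
            processor.flush();
        } while (numPendingRecords.get() != 0);
    }

    public static void main(String[] args) {
        AtomicLong pending = new AtomicLong();
        MockBulkProcessor processor = new MockBulkProcessor(pending);
        processor.add("a");
        processor.add("b");

        flushUntilDrained(processor, pending);
        System.out.println(processor.flushCalls); // prints 1

        // Nothing buffered and nothing pending: still just a single pass, no busy loop.
        flushUntilDrained(processor, pending);
        System.out.println(processor.flushCalls); // prints 2
    }
}
```

The loop depends only on the pending counter, not on what `flush()` does internally, which is the decoupling argued for above.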
> Proper at-least-once support for ElasticsearchSink
> --------------------------------------------------
>
> Key: FLINK-5487
> URL: https://issues.apache.org/jira/browse/FLINK-5487
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
> Priority: Critical
>
> Discussion in ML:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fault-tolerance-guarantees-of-Elasticsearch-sink-in-flink-elasticsearch2-td10982.html
> Currently, the Elasticsearch Sink actually doesn't offer any guarantees for
> message delivery.
> For proper support of at-least-once, the sink will need to participate in
> Flink's checkpointing: when snapshotting is triggered at the
> {{ElasticsearchSink}}, we need to synchronize on the pending ES requests by
> flushing the internal bulk processor. For temporary ES failures (see
> FLINK-5122) that may happen on the flush, we should retry them before
> returning from snapshotting and acking the checkpoint. If there are
> non-temporary ES failures on the flush, the current snapshot should fail.
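
The contract in the issue description can be sketched roughly as follows. This is a simplified, synchronous model under stated assumptions, not the sink's implementation: `Backend`, `Outcome`, `maxRetries` and `pendingCount` are hypothetical names, and the bounded retry count is an assumption that the issue does not specify.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only; a synchronous model of "flush on checkpoint", not Flink or ES code.
class SnapshotFlushSketch {

    enum Outcome { SUCCESS, TEMPORARY_FAILURE, PERMANENT_FAILURE }

    /** Hypothetical stand-in for sending one request to Elasticsearch. */
    interface Backend {
        Outcome send(String request);
    }

    private final Queue<String> pending = new ArrayDeque<>();
    private final Backend backend;
    private final int maxRetries; // assumption: retries are bounded

    SnapshotFlushSketch(Backend backend, int maxRetries) {
        this.backend = backend;
        this.maxRetries = maxRetries;
    }

    /** Buffers a request; nothing is sent until the snapshot flushes. */
    void invoke(String request) {
        pending.add(request);
    }

    int pendingCount() {
        return pending.size();
    }

    /** Returns normally only when every pending request has been acknowledged. */
    void snapshotState() {
        while (!pending.isEmpty()) {
            String request = pending.peek();
            int retries = 0;
            while (true) {
                Outcome outcome = backend.send(request);
                if (outcome == Outcome.SUCCESS) {
                    pending.remove();
                    break;
                }
                if (outcome == Outcome.PERMANENT_FAILURE) {
                    // Non-temporary failure: the current snapshot must fail.
                    throw new IllegalStateException("permanent failure for request: " + request);
                }
                if (++retries > maxRetries) {
                    throw new IllegalStateException("retries exhausted for request: " + request);
                }
                // Temporary failure: retry before the checkpoint is acknowledged.
            }
        }
    }
}
```

Because `snapshotState()` throws instead of returning while requests are unacknowledged, a checkpoint that completes implies that all records received before it have been delivered at least once.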