[
https://issues.apache.org/jira/browse/FLINK-23875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407268#comment-17407268
]
Jark Wu commented on FLINK-23875:
---------------------------------
Sorry [~fabian.paul], but I don't understand why we need to snapshot the
buffer. In the original design the buffer does not need to be snapshotted,
because we flush the buffer before checkpointing the current operator (see the
sketch below). If the flush fails, the current checkpoint fails as well. If the
checkpoint succeeds, the buffered records have already been written to the
external system, so there is nothing left to snapshot.
The buffer can be very large, which would hurt checkpoint performance. It is
also very complex to redistribute the buffered data when the sink parallelism
is rescaled, and doing so may result in out-of-order updates.
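To make the intended contract concrete, here is a minimal sketch of the
flush-before-checkpoint pattern described above. This is illustrative only:
FlushOnCheckpointSink and flushToExternalSystem are made-up names, not the
actual connector code.
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

/**
 * Sketch of a sink that buffers records in memory but flushes the buffer
 * inside snapshotState(), so the buffer itself is never stored in state.
 * If the flush throws, the checkpoint fails; if the checkpoint succeeds,
 * everything that was buffered has already reached the external system.
 */
public class FlushOnCheckpointSink<T> extends RichSinkFunction<T>
        implements CheckpointedFunction {

    private final List<T> buffer = new ArrayList<>();

    @Override
    public void invoke(T value, Context context) {
        // Collect records instead of writing them out one by one.
        buffer.add(value);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Flush synchronously before acknowledging the checkpoint for this
        // operator. A failure here fails the checkpoint, so a successful
        // checkpoint implies the buffered records were already written.
        flushToExternalSystem(buffer);
        buffer.clear();
        // Deliberately, no operator/keyed state is written for the buffer.
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // Nothing to restore: under this design the buffer is always empty
        // at the point a checkpoint completes.
    }

    private void flushToExternalSystem(List<T> records) throws Exception {
        // Placeholder for e.g. a synchronous producer flush / batch write.
    }
}
{code}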
> ReducingUpsertSink can lose records during failover
> -----------------------------------------------------
>
> Key: FLINK-23875
> URL: https://issues.apache.org/jira/browse/FLINK-23875
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka, Table SQL / API
> Affects Versions: 1.14.0
> Reporter: Fabian Paul
> Assignee: Fabian Paul
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0
>
>
> When trying to rework the Table API Kafka connector to make it compatible
> with the new KafkaSink, I noticed that the buffer which is used to reduce
> the update-before and update-after calls is currently not snapshotted, which
> can result in data loss if the job fails while the buffer is not empty.
>
> Before 1.14, the equivalent class was called BufferedUpsertSinkFunction.
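For context, a rough illustration of the kind of per-key reduction the
description refers to. This is not the actual ReducingUpsertSink
implementation; RowKind, Change and the String key below are simplified
stand-ins.
{code:java}
import java.util.HashMap;
import java.util.Map;

/**
 * The point of this sketch is only that the reduction keeps the latest change
 * per key in an in-memory map, and that whatever is still in that map is gone
 * if the job fails before the next flush and the map was never snapshotted.
 */
class ReducingBufferSketch {

    enum RowKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    record Change(RowKind kind, String key, String payload) { }

    private final Map<String, Change> buffer = new HashMap<>();

    void add(Change change) {
        if (change.kind() == RowKind.UPDATE_BEFORE) {
            // Reduced away: the matching UPDATE_AFTER for the same key
            // overwrites the entry anyway, so only one upsert is emitted.
            return;
        }
        buffer.put(change.key(), change);
    }

    /** Records that are lost if they are neither flushed nor snapshotted. */
    Map<String, Change> pending() {
        return buffer;
    }
}
{code}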
--
This message was sent by Atlassian Jira
(v8.3.4#803005)