https://issues.apache.org/jira/browse/KAFKA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116650#comment-16116650

Randall Hauch edited comment on KAFKA-5696 at 8/7/17 2:29 PM:
--------------------------------------------------------------

Kafka Connect currently guarantees *at-least-once* delivery, for a couple of 
reasons.

First, Kafka Connect currently commits offsets *periodically* (configurable 
via the worker's {{offset.flush.interval.ms}} property, which defaults to 60 
seconds), and Kafka Connect will always restart from the last *committed 
offset*. The most recent offsets are always committed just prior to graceful 
shutdown (and should be just prior to a rebalance, per this issue), but any 
unexpected failure will likely mean that the connector has produced records 
that were written to Kafka but are not yet reflected in the committed offsets. 
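A minimal sketch of the relevant worker configuration (the value shown is the 
default); on an unclean crash, the connector may re-produce up to this window 
of records:

```properties
# Kafka Connect worker configuration (sketch).
# Source offsets are committed on this interval; after an unclean failure
# the connector restarts from the last committed offset, so any records
# produced since that commit may be written to Kafka again.
offset.flush.interval.ms=60000
```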

Second, when the Kafka Connect worker writes source records to Kafka, the 
broker may successfully accept a batch of source records, write it to the 
specified number of replicas, and send an acknowledgement. However, if a 
network glitch prevents the Kafka Connect worker's producer from receiving 
the broker's acknowledgement, the producer will resend the batch of source 
records, producing duplicates. 
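A minimal sketch of the producer settings behind that second point. The class 
name and values here are illustrative, but {{acks}}, {{retries}}, and 
{{enable.idempotence}} are real Kafka producer configuration keys:

```java
import java.util.Properties;

// Sketch (not the worker's actual code): producer settings under which a
// lost acknowledgement leads to a duplicated batch.
public class RetrySketch {
    static Properties producerProps() {
        Properties p = new Properties();
        p.put("acks", "all");     // wait for the broker's acknowledgement
        p.put("retries", "3");    // resend the batch if the ack never arrives
        // Without idempotence, a retried batch that the broker already
        // wrote is appended a second time.
        p.put("enable.idempotence", "false");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("retries"));
    }
}
```

From Kafka 0.11 onward, an idempotent producer lets the broker de-duplicate 
retried batches, but that addresses only this resend case, not the 
offset-commit window described above.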

Bottom line: there is always a chance that the connector produces records 
that are written to Kafka but are not reflected in the committed offsets. 



> SourceConnector does not commit offset on rebalance
> ---------------------------------------------------
>
>                 Key: KAFKA-5696
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5696
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Oleg Kuznetsov
>              Labels: newbie
>             Fix For: 0.10.0.2
>
>
> I'm running a SourceConnector that reads files from storage and puts data 
> in Kafka. In case of reconfiguration, I want offsets to be flushed. 
> Say a file is completely processed, but its source records are not yet 
> committed; in case of reconfiguration their offsets might be missing from 
> the store.
> Is it possible to force committing offsets on reconfiguration?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
