I guess you would get duplicates if you crash after data was written
into the topics but before offsets were committed.

So there is no data loss or reordering in this case, only duplication.
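
To illustrate the crash-before-commit case, here is a minimal Python sketch. It is a toy simulation, not Connect's actual code: `run_source_task` and `crash_before_commit_at` are made-up names, the "topic" is just a list, and real Connect commits source offsets periodically in batches rather than after every record.

```python
def run_source_task(upstream, topic, committed_offset, crash_before_commit_at=None):
    """Copy records from `upstream` into `topic`, starting at `committed_offset`.

    Returns the last committed offset. If `crash_before_commit_at` is set, the
    task "crashes" right after writing that record, before committing it.
    (All names here are hypothetical; this only models the write/commit gap.)
    """
    position = committed_offset
    while position < len(upstream):
        topic.append(upstream[position])   # step 1: record is written to the topic
        if crash_before_commit_at == position:
            return committed_offset        # crash: the write happened, the commit did not
        position += 1
        committed_offset = position        # step 2: source offset is committed
    return committed_offset


upstream = ["a", "b", "c", "d"]
topic = []

# First run crashes after writing "c" (index 2) but before committing its offset.
offset = run_source_task(upstream, topic, 0, crash_before_commit_at=2)

# After the restart, the task resumes from the last committed offset.
offset = run_source_task(upstream, topic, offset)

print(topic)  # ['a', 'b', 'c', 'c', 'd']
```

Nothing from the upstream is missing and the relative order is unchanged; the only effect is that "c" is written twice.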


-Matthias

On 1/28/21 11:20 AM, nitin agarwal wrote:
> Hi,
> 
> By committing the offsets, I meant tracking the progress of how much data
> is read from the upstream system. In Kafka Connect this is referred to
> as committing the offsets.
> This is the method I was talking about:
> https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L462-L567
> 
> My question is: what happens if the connector gets restarted, or the node on
> which the connector is running goes down, just before flushing the offsets
> <https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L521>?
> 
> Thank you,
> Nitin
> 
> 
> 
> On Thu, Jan 28, 2021 at 9:54 PM Matthias J. Sax <mj...@apache.org> wrote:
> 
>> I don't know all details of Connect...
>>
>> However, not sure what you mean by "committing offsets"?
>>
>> A source connector takes data from an external data source and writes it
>> into a Kafka topic. Thus, there should not be any offsets to be
>> committed. (Committing offsets only applies if you read from a topic.)
>>
>> Instead, the "progress" (how much data has been read from the upstream
>> system) needs to be tracked. If done right (which I assume Connect does --
>> not sure if there might be a concrete connector dependency?) there should
>> not be out-of-order data.
>>
>> But I hope that some Connect expert can chime in...
>>
>>
>> -Matthias
>>
>> On 1/28/21 12:24 AM, nitin agarwal wrote:
>>> Assuming the configurations are as follows:
>>> max.in.flight.requests.per.connection=1
>>> enable.idempotence=false
>>>
>>> Thanks,
>>> Nitin
>>>
>>>
>>> On Thu, Jan 28, 2021 at 1:53 PM nitin agarwal <nitingarg...@gmail.com>
>>> wrote:
>>>
>>>> Thanks for the quick reply, I understand this behaviour now.
>>>> I have another follow-up question.
>>>>
>>>> Can the Source connector write out of order messages in a case where
>>>> there is a failure in committing the offset and the connector is
>>>> restarted at the same time?
>>>>
>>>> Thanks,
>>>> Nitin
>>>>
>>>> On Thu, Jan 28, 2021 at 8:06 AM Matthias J. Sax <mj...@apache.org> wrote:
>>>>
>>>>> There should not be any data loss.
>>>>>
>>>>> However, if a request fails and is retried, it may lead to reordering
>>>>> of sends. Thus, records would not be ordered based on the `send()`
>>>>> calls any longer.
>>>>>
>>>>> If you enable idempotent writes, ordering is guaranteed even with
>>>>> multiple in-flight requests per connection, though.
>>>>>
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 1/27/21 11:35 AM, nitin agarwal wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I see that max.in.flight.requests.per.connection is set to 1
>>>>>> explicitly in Kafka Connect, but there is a way to override it. I want
>>>>>> to understand the impact of setting its value > 1.
>>>>>> As per my understanding, it will lead to data loss in some cases. Is
>>>>>> that correct?
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>> Nitin
>>>>>>
>>>>>
>>>>
>>>
>>
> 
