[ 
https://issues.apache.org/jira/browse/KAFKA-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725713#comment-17725713
 ] 

Yash Mayya commented on KAFKA-15018:
------------------------------------

Thanks for filing this bug ticket [~ChrisEgerton], it's an interesting one.
This bug also affects regular source connectors that are configured to use a
separate offsets topic, right (and not only source connectors with exactly-once
support enabled, as indicated in the ticket title / description)?

I like the idea of synchronously writing tombstones to the global offsets store
before writing them to the connector-specific offsets store - it's clean and
simple. You have a valid point about two synchronous writes to topics on
potentially separate Kafka clusters being sub-optimal. However, I also agree
with your later point that tombstone offsets are pretty uncommon, based on the
connectors and use cases I've encountered so far. I think it's a very
reasonable trade-off to make (i.e. slightly worse performance for an edge case
in exchange for avoiding potential correctness issues in that same edge case).
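
To make the ordering concrete, here is a rough sketch of what I have in mind. It is not the actual Connect code: SketchOffsetStore, InMemoryStore and TombstoneFirstWriter are hypothetical names, offsets are simplified to String keys and values, and a null value stands for a tombstone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an offsets store; NOT Kafka Connect's real interface.
interface SketchOffsetStore {
    void write(Map<String, String> offsets) throws Exception;
}

// In-memory store used only to illustrate the ordering of writes.
class InMemoryStore implements SketchOffsetStore {
    final Map<String, String> data = new HashMap<>();
    boolean failWrites = false; // flip to simulate an unreachable cluster

    @Override
    public void write(Map<String, String> offsets) throws Exception {
        if (failWrites) {
            throw new Exception("simulated write failure");
        }
        offsets.forEach((partition, offset) -> {
            if (offset == null) {
                data.remove(partition); // a null offset is a tombstone
            } else {
                data.put(partition, offset);
            }
        });
    }
}

class TombstoneFirstWriter {
    static void commit(Map<String, String> offsets,
                       SketchOffsetStore global,
                       SketchOffsetStore connectorSpecific) throws Exception {
        Map<String, String> tombstones = new HashMap<>();
        Map<String, String> regular = new HashMap<>();
        offsets.forEach((partition, offset) -> {
            if (offset == null) {
                tombstones.put(partition, null);
            } else {
                regular.put(partition, offset);
            }
        });

        // 1. Tombstones go to the global store first, synchronously.
        //    A failure here propagates and aborts the whole commit.
        if (!tombstones.isEmpty()) {
            global.write(tombstones);
        }

        // 2. Only then write everything to the connector-specific store.
        connectorSpecific.write(offsets);

        // 3. Regular offsets are still mirrored to the global store on a
        //    best-effort basis, as described in the ticket.
        try {
            global.write(regular);
        } catch (Exception e) {
            System.err.println("WARN: failed to mirror offsets to global store: " + e.getMessage());
        }
    }
}
```

With this ordering, a failed tombstone write to the global store leaves the connector-specific store untouched, so the two stores cannot end up disagreeing about whether the partition was wiped.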

An alternate idea I had was to treat tombstone values differently from the
absence of an offset - i.e. use a special value to represent tombstones in the
offset store. This would allow us to distinguish between no offset being
present in the connector-specific offsets store for a particular partition
versus it being explicitly wiped by a connector task via a null / tombstone
offset. The "special" value wouldn't be persisted in the topic; it would only
be present in the in-memory store which represents the materialized view of the
offsets topic. However, we'd need to do some additional work to ensure that this
value isn't leaked to connectors / tasks - basically, it should only be
surfaced to the ConnectorOffsetBackingStore in order to make a decision on
whether or not to use the offset from the global offsets store. I personally
think that the other approach is cleaner overall though. WDYT?
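
For comparison, a minimal sketch of the special-value idea could look like the following. Everything here is an assumption for illustration: SentinelOffsetView and its methods are hypothetical names, not existing Connect classes, and offsets are again simplified to Strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical materialized view over both offsets topics.
class SentinelOffsetView {
    // Marker kept only in memory; it is never written to the topic and
    // never returned to connectors / tasks.
    private static final Object TOMBSTONE = new Object();

    private final Map<String, Object> connectorView = new HashMap<>();
    private final Map<String, String> globalView = new HashMap<>();

    // Called for each record read from the connector-specific offsets topic.
    void applyConnectorRecord(String partition, String offset) {
        connectorView.put(partition, offset == null ? TOMBSTONE : offset);
    }

    // Called for each record read from the global offsets topic.
    void applyGlobalRecord(String partition, String offset) {
        if (offset == null) {
            globalView.remove(partition);
        } else {
            globalView.put(partition, offset);
        }
    }

    // Returns the offset to hand to the task, or null if there is none.
    String lookup(String partition) {
        Object value = connectorView.get(partition);
        if (value == TOMBSTONE) {
            return null; // explicitly wiped: do NOT fall back to the global store
        }
        if (value != null) {
            return (String) value;
        }
        return globalView.get(partition); // truly absent: fall back
    }
}
```

The marker is translated back to null inside lookup, which is the "don't leak it to tasks" part; the extra bookkeeping around that is what makes me prefer the synchronous-write approach.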

> Potential tombstone offsets corruption for exactly-once source connectors
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-15018
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15018
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 3.3.0, 3.4.0, 3.3.1, 3.3.2, 3.5.0, 3.4.1
>            Reporter: Chris Egerton
>            Priority: Major
>
> When exactly-once support is enabled for source connectors, source offsets 
> can potentially be written to two different offsets topics: a topic specific 
> to the connector, and the global offsets topic (which was used for all 
> connectors prior to KIP-618 / version 3.3.0).
> Precedence is given to offsets in the per-connector offsets topic, but if 
> none are found for a given partition, then the global offsets topic is used 
> as a fallback.
> When committing offsets, a transaction is used to ensure that source records 
> and source offsets are written to the Kafka cluster targeted by the source 
> connector. This transaction only includes the connector-specific offsets 
> topic. Writes to the global offsets topic take place after writes to the 
> connector-specific offsets topic have completed successfully, and if they 
> fail, a warning message is logged, but no other action is taken.
> Normally, this ensures that, for offsets committed by exactly-once-supported 
> source connectors, the per-connector offsets topic is at least as up-to-date 
> as the global offsets topic, and sometimes even ahead.
> However, for tombstone offsets, we lose that guarantee. If a tombstone offset 
> is successfully written to the per-connector offsets topic, but cannot be 
> written to the global offsets topic, then the global offsets topic will still 
> contain that source offset, but the per-connector topic will not. Due to the 
> fallback-on-global logic used by the worker, if a task requests offsets for 
> one of the tombstoned partitions, the worker will provide it with the offsets 
> present in the global offsets topic, instead of indicating to the task that 
> no offsets can be found.
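
The failure mode in the description above can be reproduced with a toy model of the fallback-on-global lookup. This is only a sketch under simplifying assumptions: FallbackLookupSketch is a hypothetical class, the two topics are modelled as plain maps, and the failed write to the global topic is modelled by simply not applying the tombstone there.

```java
import java.util.HashMap;
import java.util.Map;

class FallbackLookupSketch {
    final Map<String, String> perConnector = new HashMap<>();
    final Map<String, String> global = new HashMap<>();

    // Per-connector offsets take precedence; the global store is the fallback.
    String lookup(String partition) {
        String offset = perConnector.get(partition);
        return offset != null ? offset : global.get(partition);
    }

    static String demo() {
        FallbackLookupSketch stores = new FallbackLookupSketch();
        // An earlier commit reached both topics.
        stores.perConnector.put("p1", "42");
        stores.global.put("p1", "42");

        // The tombstone is applied to the per-connector topic...
        stores.perConnector.remove("p1");
        // ...but the follow-up write to the global topic fails, so the
        // global entry for "p1" is left in place.

        // The task asks for its offset and gets the stale value back.
        return stores.lookup("p1");
    }
}
```

Calling FallbackLookupSketch.demo() returns the stale offset "42" rather than null, which is the corruption the ticket describes.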



