C0urante commented on PR #14158:
URL: https://github.com/apache/kafka/pull/14158#issuecomment-1668229509

   I've been scratching my head over this one for a bit. On one hand, it's 
nice to allow heavily-filtered source connectors to record progress (and this 
was a suggestion I made to address part of the motivation for 
[KIP-910](https://cwiki.apache.org/confluence/display/KAFKA/KIP-910%3A+Update+Source+offsets+for+Source+Connectors+without+producing+records))
 so that there are fewer duplicates if one is restarted.
   
   However, the current behavior when exactly-once support is disabled also has 
some benefits. Right now it's possible to write an SMT that does batching of 
many source records into a single Kafka record.
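
   As a rough illustration of that batching pattern, here is a minimal sketch. It uses hypothetical stand-in types (`SimpleRecord`, `BatchingTransform`) rather than the real Connect `SourceRecord` and `Transformation` classes, and only borrows the convention that returning `null` from `apply` drops a record. The batch size and the comma-joined value are assumptions made purely for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a source record; NOT the real Connect SourceRecord.
final class SimpleRecord {
    final long sourceOffset;
    final String value;

    SimpleRecord(long sourceOffset, String value) {
        this.sourceOffset = sourceOffset;
        this.value = value;
    }
}

// Hypothetical batching transform. Like a Connect SMT, it returns null to
// signal "nothing to emit for this record", but here the record is buffered,
// not discarded.
final class BatchingTransform {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchingTransform(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
    }

    // Buffers the value; emits one combined record once the batch is full.
    SimpleRecord apply(SimpleRecord record) {
        buffer.add(record.value);
        if (buffer.size() < batchSize) {
            return null; // looks like a dropped record to the caller
        }
        String combined = String.join(",", buffer);
        buffer.clear();
        // The combined record carries the offset of the last record in the batch.
        return new SimpleRecord(record.sourceOffset, combined);
    }
}

class BatchingSketch {
    public static void main(String[] args) {
        BatchingTransform transform = new BatchingTransform(3);
        for (long offset = 1; offset <= 4; offset++) {
            SimpleRecord out = transform.apply(new SimpleRecord(offset, "v" + offset));
            System.out.println(offset + " -> " + (out == null ? "null" : out.value));
        }
    }
}
```

   In a transform like this, a `null` return does not mean the record is gone for good. If offsets were committed for every record that appears to be dropped, the offsets of buffered records could presumably be committed before the batch containing them is ever produced.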
   
   I'm also curious--what's the behavior with sink connectors when records are 
filtered via SMT? Does this vary depending on whether the connector's task 
class overrides the `SinkTask::preCommit` method?
   
   @vamossagar12 Ultimately I agree that some work probably has to be done 
around this logic, and thanks for identifying the discrepancy. I'm just not 
certain that the decision I made to commit offsets for dropped records when 
working on exactly-once source connectors was the correct one, and think we 
should at least consider reverting that change in behavior rather than updating 
other, longer-existing modes to align with it.


