yashmayya commented on code in PR #13948:
URL: https://github.com/apache/kafka/pull/13948#discussion_r1263626035


##########
connect/api/src/main/java/org/apache/kafka/connect/source/SourceTask.java:
##########
@@ -105,9 +105,11 @@ public void initialize(SourceTaskContext context) {
     public abstract List<SourceRecord> poll() throws InterruptedException;
 
     /**
-     * <p>
-     * Commit the offsets, up to the offsets that have been returned by {@link 
#poll()}. This
-     * method should block until the commit is complete.
+     * This method is invoked periodically when offsets are committed for this 
source task. Note that the offsets
+     * being committed won't necessarily correspond to the latest offsets 
returned by this source task via
+     * {@link #poll()}. When exactly-once support is disabled, offsets are 
committed periodically and asynchronously
+     * (i.e. on a separate thread from the one which calls {@link #poll()}). 
When exactly-once support is enabled,
+     * offsets are committed on transaction commits (also see {@link 
TransactionBoundary}).

Review Comment:
   > I don't love how we're outlining differences in behavior when exactly-once 
support is enabled/disabled; it adds to the cognitive load and may tempt 
connector developers to write connectors that are designed to work exclusively 
with one mode or the other.
   
   > Could it be enough to leave this bit out and rely on the "Note that the 
offsets being committed won't necessarily correspond to the latest offsets 
returned by this source task via poll" part?
   
   Yeah, sounds good 👍 
   
   It's always a delicate balance between providing enough information to be 
useful and providing so much that it's overwhelming / confusing 😅 
   
   > We can also refer people to SourceTask::commitRecord for fine-grained 
tracking of records (though there's also no guarantee that all records that 
have been ack'd in that method will have their offsets committed before a call 
to SourceTask::commit).
   
    I've added this reference and avoided mentioning offsets commit guarantees. 
I guess we could explicitly mention the lack of guarantee but again it's a 
balancing act between providing information and introducing confusion...



##########
connect/api/src/main/java/org/apache/kafka/connect/source/SourceTask.java:
##########
@@ -105,9 +105,11 @@ public void initialize(SourceTaskContext context) {
     public abstract List<SourceRecord> poll() throws InterruptedException;
 
     /**
-     * <p>
-     * Commit the offsets, up to the offsets that have been returned by {@link 
#poll()}. This
-     * method should block until the commit is complete.
+     * This method is invoked periodically when offsets are committed for this 
source task. Note that the offsets
+     * being committed won't necessarily correspond to the latest offsets 
returned by this source task via
+     * {@link #poll()}. When exactly-once support is disabled, offsets are 
committed periodically and asynchronously
+     * (i.e. on a separate thread from the one which calls {@link #poll()}). 
When exactly-once support is enabled,
+     * offsets are committed on transaction commits (also see {@link 
TransactionBoundary}).

Review Comment:
   > I don't love how we're outlining differences in behavior when exactly-once 
support is enabled/disabled; it adds to the cognitive load and may tempt 
connector developers to write connectors that are designed to work exclusively 
with one mode or the other.
   
   > Could it be enough to leave this bit out and rely on the "Note that the 
offsets being committed won't necessarily correspond to the latest offsets 
returned by this source task via poll" part?
   
   Yeah, sounds good 👍 
   
   It's always a delicate balance between providing enough information to be 
useful and providing so much that it's overwhelming / confusing 😅 
   
   > We can also refer people to SourceTask::commitRecord for fine-grained 
tracking of records (though there's also no guarantee that all records that 
have been ack'd in that method will have their offsets committed before a call 
to SourceTask::commit).
   
    I've added this reference and avoided mentioning offsets commit guarantees. 
I guess we could explicitly mention the lack of guarantee but again it's a 
balancing act between providing information and introducing confusion...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to