[ 
https://issues.apache.org/jira/browse/KAFKA-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752846#comment-16752846
 ] 

ASF GitHub Bot commented on KAFKA-7873:
---------------------------------------

rhauch commented on pull request #6203: KAFKA-7873: Always seek to beginning in 
KafkaBasedLog
URL: https://github.com/apache/kafka/pull/6203
 
 
   Explicitly seek KafkaBasedLog’s consumer to the beginning of the topic 
partitions, rather than potentially using committed offsets (which would be 
unexpected) if `group.id` is set, or relying upon `auto.offset.reset=earliest` 
if `group.id` is null.
   
   This should not change existing behavior but should remove some potential 
issues introduced with KIP-289 if `group.id` is not set in the consumer 
configurations. Note that even if `group.id` is set, we still always want to 
consume from the beginning.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 


> KafkaBasedLog's consumer should always seek to beginning when starting
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-7873
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7873
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.1.0
>            Reporter: Randall Hauch
>            Assignee: Randall Hauch
>            Priority: Critical
>
> KafkaBasedLog expects that callers set the `group.id` for the consumer 
> configuration, and does not itself set the `group.id` if the caller does not 
> explicitly do so. However, 
> [KIP-289|https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer]
>  changed the default for the `group.id` from a blank string to null, which 
> changes how KafkaBasedLog behaves when no `group.id` is set. The KIP also 
> deprecates that usage and issues a warning when no `group.id` is specified.
> When KafkaBasedLog starts up, it should always start from the beginning of 
> the topic and consume to the end. The consumer's logic for where to start is 
> always:
> # explicit seek
> # committed offset (skipped if group.id is null)
> # auto reset behavior
> and currently Connect does not explicitly seek to the beginning and instead 
> relies upon `auto.offset.reset=earliest`. However, if a `group.id` is 
> specified *and* there are committed offsets, then the consumer will start 
> from the committed offsets rather than from the beginning. If a `group.id` is 
> not specified, then the auto reset behavior should work.
> However, to avoid the warning and possible exception when no `group.id` is 
> specified, KafkaBasedLog should always call {{consumer.seekToBeginning()}} 
> during startup. 
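
The start-position order listed in the description (explicit seek, then committed offset, then auto reset) can be illustrated with a small plain-Java model. This is only a sketch and not Kafka client code: the class `StartOffsetModel`, the `AutoReset` enum, and the `resolve` method are hypothetical names made up for illustration, and the model ignores details such as `auto.offset.reset=none`.

```java
import java.util.OptionalLong;

/** Simplified, hypothetical model of the start-position order; not Kafka client code. */
final class StartOffsetModel {

    /** Stand-ins for the "earliest" and "latest" values of auto.offset.reset. */
    enum AutoReset { EARLIEST, LATEST }

    /**
     * Returns the offset a consumer would start from under the three-step order.
     *
     * @param explicitSeek offset set by an explicit seek, if any
     * @param groupId      the consumer's group.id, or null when unset
     * @param committed    offset committed for that group, if any
     * @param reset        auto reset policy used as the last resort
     * @param beginning    first offset in the partition
     * @param end          offset one past the last record
     */
    static long resolve(OptionalLong explicitSeek, String groupId, OptionalLong committed,
                        AutoReset reset, long beginning, long end) {
        // 1. An explicit seek always wins.
        if (explicitSeek.isPresent()) {
            return explicitSeek.getAsLong();
        }
        // 2. Committed offsets are consulted only when a group.id is set.
        if (groupId != null && committed.isPresent()) {
            return committed.getAsLong();
        }
        // 3. Otherwise fall back to the auto reset policy.
        return reset == AutoReset.EARLIEST ? beginning : end;
    }
}
```

In this model, a consumer with a `group.id` and a committed offset of 42 starts at 42 even with `EARLIEST`, which is the unexpected case described above; passing an explicit seek to the beginning offset makes the result independent of both `group.id` and any committed offsets.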



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
