ASF GitHub Bot commented on KAFKA-9558:

skaundinya15 commented on pull request #8119: KAFKA-9558: Fixing retry logic 
for getListOffsetsCalls
URL: https://github.com/apache/kafka/pull/8119
   This PR fixes the retry logic for `getListOffsetsCalls`. Previously, if any 
partitions returned errors, only the current call object was passed on to be 
retried after a metadata refresh. This is incorrect: if the leader has changed, 
the call object is never updated with the new leader node to query. The fix 
makes another call to `getListOffsetsCalls` with only the failed topic 
partitions and uses the result as the next calls to make after the metadata 
refresh. A new test covers the scenario where a leader change occurs.
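
   The difference between the two retry strategies can be sketched roughly as 
follows. This is a minimal illustration, not the code from `KafkaAdminClient`: 
the `Call` type, the `buildCalls`, `retryStale` and `retryRebuilt` helpers, and 
the use of plain strings and integers for partitions and nodes are all 
hypothetical stand-ins, and every partition is assumed to have a known leader.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these types are Kafka's real classes.
final class ListOffsetsRetrySketch {

    /** A request bound to one broker node for a fixed set of partitions. */
    static final class Call {
        final int leaderNode;
        final List<String> partitions;

        Call(int leaderNode, List<String> partitions) {
            this.leaderNode = leaderNode;
            this.partitions = partitions;
        }
    }

    /**
     * Groups partitions by their leader in the given metadata view and builds
     * one call per leader. Assumes every partition has an entry in `leaders`.
     */
    static List<Call> buildCalls(Map<String, Integer> leaders, Collection<String> partitions) {
        Map<Integer, List<String>> byLeader = new TreeMap<>();
        for (String tp : partitions) {
            byLeader.computeIfAbsent(leaders.get(tp), k -> new ArrayList<>()).add(tp);
        }
        List<Call> calls = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : byLeader.entrySet()) {
            calls.add(new Call(e.getKey(), e.getValue()));
        }
        return calls;
    }

    /** Old behaviour: the same call object is retried, so its node never changes. */
    static List<Call> retryStale(Call failedCall) {
        return Collections.singletonList(failedCall);
    }

    /** Fixed behaviour: rebuild calls for only the failed partitions from refreshed metadata. */
    static List<Call> retryRebuilt(Map<String, Integer> refreshedLeaders, List<String> failedPartitions) {
        return buildCalls(refreshedLeaders, failedPartitions);
    }
}
```

   With the stale strategy, a partition whose leader moved from node 1 to node 2 
keeps being queried on node 1. With the rebuilt strategy, the retry targets 
node 2 and carries only the partitions that failed.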
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)

> getListOffsetsCalls doesn't update node in case of leader change
> ----------------------------------------------------------------
>                 Key: KAFKA-9558
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9558
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin
>    Affects Versions: 2.5.0
>            Reporter: Sanjana Kaundinya
>            Assignee: Sanjana Kaundinya
>            Priority: Critical
> As seen here:
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L3810]
> When handling the response of the `listOffsets` call, if any topic partition 
> has an error that requires a metadata refresh, the handler simply passes 
> `this` as the call object to retry. This is incorrect after a leader change, 
> because the call object's leader node is never updated. The result is a tight 
> loop of list offsets requests sent to the old leader that never return 
> offsets, even though the metadata was correctly updated.

