[
https://issues.apache.org/jira/browse/KAFKA-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Chen updated KAFKA-14010:
------------------------------
Description:
When submitting the AlterIsr request, we register a future listener to handle
the response
[here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/Partition.scala#L1585-L1610].
When we receive a retriable error, we expect the AlterIsr request to be
retried, so the listener re-submits the request.
However, `unsentIsrUpdates` has not been cleared by the time the future
listener is called, so we fail to "enqueue" the retry because we think there
is still an in-flight request. We use "try/finally" to make sure
unsentIsrUpdates gets cleared
([here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/AlterPartitionManager.scala#L362-L370]),
but that happens "after" we retry the request.
Although the AlterIsr request will be sent the next time the follower sends a
fetch request to the leader, we still need to fix this issue to make sure the
AlterIsr request is sent successfully, as expected.
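
The ordering problem described above can be reproduced in isolation. The following is a minimal Java sketch, not the actual Kafka code: the class IsrRetrySketch, the enqueue and retryAccepted helpers, the String partition key, and the use of CompletableFuture are all assumptions made for illustration. Only the name unsentIsrUpdates comes from the description. It relies on the fact that a callback registered with whenComplete before completion runs in the thread that completes the future, so it runs before the finally block.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class IsrRetrySketch {
    // Hypothetical stand-in for the pending-update map, keyed by partition name.
    static final ConcurrentHashMap<String, CompletableFuture<String>> unsentIsrUpdates =
        new ConcurrentHashMap<>();

    // Enqueue succeeds only when no update is already pending for the partition.
    static boolean enqueue(String partition, CompletableFuture<String> future) {
        return unsentIsrUpdates.putIfAbsent(partition, future) == null;
    }

    // Returns true if the listener's retry could be enqueued.
    static boolean retryAccepted(boolean clearBeforeCompleting) {
        unsentIsrUpdates.clear();
        String partition = "topic-0";
        AtomicBoolean retried = new AtomicBoolean(false);

        CompletableFuture<String> future = new CompletableFuture<>();
        enqueue(partition, future);

        // Listener: on an error (assumed retriable here), re-submit the request.
        future.whenComplete((result, error) -> {
            if (error != null) {
                retried.set(enqueue(partition, new CompletableFuture<>()));
            }
        });

        RuntimeException retriable = new RuntimeException("retriable error");
        if (clearBeforeCompleting) {
            // Assumed alternative ordering: clear the entry, then notify the listener.
            unsentIsrUpdates.remove(partition);
            future.completeExceptionally(retriable);
        } else {
            // Ordering from the description: the listener runs inside
            // completeExceptionally, before the finally block clears the entry.
            try {
                future.completeExceptionally(retriable);
            } finally {
                unsentIsrUpdates.remove(partition);
            }
        }
        return retried.get();
    }

    public static void main(String[] args) {
        System.out.println("clear in finally:  " + retryAccepted(false));
        System.out.println("clear before done: " + retryAccepted(true));
    }
}
```

In this sketch the retry is rejected when the entry is cleared in the finally block and accepted when it is cleared before the future completes. Whether the real fix should reorder the clearing this way or take a different approach is left open here.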
> alterISR request won't retry when receiving retriable error
> -----------------------------------------------------------
>
> Key: KAFKA-14010
> URL: https://issues.apache.org/jira/browse/KAFKA-14010
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 3.2.0
> Reporter: Luke Chen
> Assignee: Luke Chen
> Priority: Major