hachikuji commented on code in PR #12181:
URL: https://github.com/apache/kafka/pull/12181#discussion_r893748854
##########
core/src/main/scala/kafka/cluster/Partition.scala:
##########
@@ -1571,14 +1620,26 @@ class Partition(val topicPartition: TopicPartition,
error match {
case Errors.OPERATION_NOT_ATTEMPTED =>
// Since the operation was not attempted, it is safe to reset back to
the committed state.
- partitionState = CommittedPartitionState(proposedIsrState.isr,
LeaderRecoveryState.RECOVERED)
+ partitionState = proposedIsrState.lastCommittedState
Review Comment:
Right, the controller sending the error code has not executed anything. But
what happens when you mix in retries and controller changes? It is possible
that a previous/future controller did act on the request and we did not receive
a response. I believe it works correctly for two reasons. First, we are
validating the partition epoch on the controller before checking for ineligible
replicas. So if a controller made the ISR change and then failed over before we
got a response, our retry would result in INVALID_UPDATE_VERSION. Second, the
raft layer guarantees the monotonicity of the controller epoch. In the same
scenario where we have lost a response, this ensures that we could not retry
the request against an older controller (which might still have the old
partition epoch). Anyway, my point is that this is a little subtle and depends
on external invariants.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]