hachikuji commented on a change in pull request #9441:
URL: https://github.com/apache/kafka/pull/9441#discussion_r643513632
##########
File path: core/src/main/scala/kafka/server/KafkaApis.scala
##########
@@ -279,30 +279,33 @@ class KafkaApis(val requestChannel: RequestChannel,
           new StopReplicaResponseData().setErrorCode(Errors.STALE_BROKER_EPOCH.code)))
     } else {
       val partitionStates = stopReplicaRequest.partitionStates().asScala
-      val (result, error) = replicaManager.stopReplicas(
-        request.context.correlationId,
-        stopReplicaRequest.controllerId,
-        stopReplicaRequest.controllerEpoch,
-        stopReplicaRequest.brokerEpoch,
-        partitionStates)
-      // Clear the coordinator caches in case we were the leader. In the case of a reassignment, we
-      // cannot rely on the LeaderAndIsr API for this since it is only sent to active replicas.
-      result.forKeyValue { (topicPartition, error) =>
-        if (error == Errors.NONE) {
-          if (topicPartition.topic == GROUP_METADATA_TOPIC_NAME
-            && partitionStates(topicPartition).deletePartition) {
-            groupCoordinator.onResignation(topicPartition.partition)
-          } else if (topicPartition.topic == TRANSACTION_STATE_TOPIC_NAME
-            && partitionStates(topicPartition).deletePartition) {
+      def onStopReplicas(error: Errors, partitions: Map[TopicPartition, Errors]): Unit = {
+        // Clear the coordinator caches in case we were the leader. In the case of a reassignment, we
+        // cannot rely on the LeaderAndIsr API for this since it is only sent to active replicas.
+        partitions.forKeyValue { (topicPartition, partitionError) =>
+          if (partitionError == Errors.NONE) {
             val partitionState = partitionStates(topicPartition)
             val leaderEpoch = if (partitionState.leaderEpoch >= 0)
-              Some(partitionState.leaderEpoch)
+              Some(partitionState.leaderEpoch)
             else
               None
-            txnCoordinator.onResignation(topicPartition.partition, coordinatorEpoch = leaderEpoch)
+            if (topicPartition.topic == GROUP_METADATA_TOPIC_NAME
+              && partitionState.deletePartition) {
+              groupCoordinator.onResignation(topicPartition.partition, leaderEpoch)
+            } else if (topicPartition.topic == TRANSACTION_STATE_TOPIC_NAME
+              && partitionState.deletePartition) {
+              txnCoordinator.onResignation(topicPartition.partition, coordinatorEpoch = leaderEpoch)
+            }
           }
         }
       }
+      val (result, error) = replicaManager.stopReplicas(
+        request.context.correlationId,
+        stopReplicaRequest.controllerId,
+        stopReplicaRequest.controllerEpoch,
+        stopReplicaRequest.brokerEpoch,
+        partitionStates,
+        onStopReplicas)

Review comment:
   If I understand correctly, the original issue concerned the potential reordering of loading/unloading events. This was possible because of inconsistent locking and the fact that we relied entirely on the order in which tasks were submitted to the scheduler. With this patch, we use the leader epoch to ensure that loading/unloading events are handled in the correct order, so it no longer matters if the events get submitted to the scheduler in the wrong order. Does that make sense, or am I still missing something?
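   To make the ordering argument above concrete, here is a minimal sketch of epoch-based fencing. This is not the code from the patch; `EpochFencingSketch`, `lastSeenEpoch`, and `maybeApply` are hypothetical names used only to illustrate why scheduler submission order stops mattering once each load/unload event carries the leader epoch:

   ```scala
   import scala.collection.mutable

   object EpochFencingSketch {
     // Highest leader/coordinator epoch observed so far for each partition id.
     private val lastSeenEpoch = mutable.Map.empty[Int, Int]

     // Run `action` only if `eventEpoch` is at least as new as anything already seen
     // for this partition; otherwise drop the event as stale. With this check, a
     // delayed unload carrying an older epoch cannot clobber state installed by a
     // newer load, regardless of the order in which the scheduler runs the tasks.
     def maybeApply(partition: Int, eventEpoch: Int)(action: => Unit): Unit = synchronized {
       if (lastSeenEpoch.get(partition).forall(eventEpoch >= _)) {
         lastSeenEpoch.update(partition, eventEpoch)
         action
       }
     }
   }

   // Example: a load at epoch 6 runs first, then a stale unload at epoch 5 arrives late.
   EpochFencingSketch.maybeApply(partition = 0, eventEpoch = 6)(println("load group state"))
   EpochFencingSketch.maybeApply(partition = 0, eventEpoch = 5)(println("unload group state")) // dropped as stale
   ```

   Under that (simplified) model, the stale unload is simply ignored, which is what I meant by the submission order no longer mattering.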