[
https://issues.apache.org/jira/browse/KAFKA-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson updated KAFKA-13858:
------------------------------------
Description:
When the KRaft broker begins controlled shutdown, it immediately disables the
metadata listener. This means that metadata changes made as part of the
controlled shutdown are not delivered to the respective components. For
partitions that the broker is a follower of, that is what we want: it prevents
the follower from rejoining the ISR while it is still shutting down. But for
partitions that the broker is leading, it means the leader remains active until
controlled shutdown finishes and the socket server is stopped. That delay can
be as much as 5 seconds, and possibly longer.
In the ZooKeeper world, we have an explicit `StopReplica` request, which serves
the purpose of shutting down both followers and leaders, but we have nothing
similar in KRaft. For KRaft, we may not need an explicit signal like this. We
know that the broker is shutting down, so we can treat partition changes as
implicit `StopReplica` requests rather than going through the normal
`LeaderAndIsr` flow.
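The proposal above can be sketched roughly as follows. This is only an illustration of the idea, not the broker's actual code: `PartitionChange`, `ShutdownAwareListener`, and every method and field on them are hypothetical names invented for this example, and the real metadata listener and replica manager are considerably more involved. The sketch assumes the listener stays subscribed during controlled shutdown and only changes how it reacts to partition changes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; these are NOT Kafka classes.
final class PartitionChange {
    final String topicPartition;
    final int newLeaderId;

    PartitionChange(String topicPartition, int newLeaderId) {
        this.topicPartition = topicPartition;
        this.newLeaderId = newLeaderId;
    }
}

final class ShutdownAwareListener {
    private final int brokerId;
    private volatile boolean shuttingDown = false;

    // Partitions whose local replica was stopped (the implicit StopReplica path).
    final List<String> stopped = new ArrayList<>();
    // Partitions handled by the normal flow, mapped to the role this broker took.
    final Map<String, String> roles = new LinkedHashMap<>();

    ShutdownAwareListener(int brokerId) {
        this.brokerId = brokerId;
    }

    // Called when controlled shutdown starts. The listener keeps receiving
    // changes; it only switches to the stop-only behaviour.
    void beginControlledShutdown() {
        shuttingDown = true;
    }

    void onPartitionChange(PartitionChange change) {
        if (shuttingDown) {
            // Treat the change as an implicit StopReplica: stop the local
            // replica whether it was leader or follower, and never let it
            // become leader or rejoin the ISR.
            roles.remove(change.topicPartition);
            stopped.add(change.topicPartition);
        } else {
            // Normal LeaderAndIsr-style handling: become leader or follower.
            String role = change.newLeaderId == brokerId ? "leader" : "follower";
            roles.put(change.topicPartition, role);
        }
    }
}
```

With this shape, a leader that loses leadership during controlled shutdown stops serving as soon as the change arrives instead of waiting for the socket server to stop, while a follower still cannot rejoin the ISR.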
was:
When the KRaft broker begins controlled shutdown, it immediately disables the
metadata listener. This means that metadata changes made as part of the
controlled shutdown are not delivered to the respective components. For
partitions that the broker is a follower of, that is what we want: it prevents
the follower from rejoining the ISR while it is still shutting down. But for
partitions that the broker is leading, it means the leader remains active until
controlled shutdown is complete.
In the ZooKeeper world, we have an explicit `StopReplica` request, which serves
the purpose of shutting down both followers and leaders, but we have nothing
similar in KRaft. For KRaft, we may not need an explicit signal like this. We
know that the broker is shutting down, so we can treat partition changes as
implicit `StopReplica` requests rather than going through the normal
`LeaderAndIsr` flow.
> Kraft should not shutdown metadata listener until controller shutdown is
> finished
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-13858
> URL: https://issues.apache.org/jira/browse/KAFKA-13858
> Project: Kafka
> Issue Type: Bug
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Major