Hi Calvin, The `BrokerState` struct I suggested would replace the `BrokerId` field in older versions.
{ "name": "ReplicaId", "type": "int32", "versions": "0-13", "entityType": "brokerId", "about": "The broker ID of the follower, of -1 if this request is from a consumer." }, { "name": "BrokerState", "type": "BrokerState", "taggedVersions": "14+", "tag": 1, "fields": [ { "name": "BrokerId", "type": "int32", "versions": "14+", "entityType": "brokerId", "about": "The broker ID of the follower, of -1 if this request is from a consumer." }, { "name": "BrokerEpoch", "type": "int64", "versions": "14+", "about": "The epoch of this follower." } ]}, Note that the version range of `ReplicaId` is set to 0-13. Version 14 onward would not include it. -Jason On Wed, Feb 22, 2023 at 12:07 PM Calvin Liu <ca...@confluent.io.invalid> wrote: > To Jose: > 1. Actually I have a second thoughts about the naming ReplicaEpoch. The > BrokerEpoch only applies to the replication protocol between the brokers. > For others like the KRaft controller, this field can be ignored. So can we > change the name to ReplicaEpoch when we really use it in other scenarios? > > On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io> wrote: > > > To Jason: > > 1. Related to the Fetch Request fields change, previously you suggested > > deprecating the ReplicaId and moving it into a BrokerState field. How > about > > we just make the BrokerEpoch a tag field? > > - The ReplicaId is currently in use and is filled every time. So that we > > can keep the way simple. > > - We can still make the optional BrokerEpoch out of the request when it > is > > not needed. > > > > On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io> wrote: > > > >> To Jason: > >> 1. We can make the BrokerEpoch a tagged field. But I am not sure about > >> your proposed metadata structure. In the BrokerState, do we need to > store > >> the BrokerId again? It would duplicate with ReplicaId. > >> 2. Considering that the broker reboot data loss case is rare and Kraft > is > >> coming soon. Plus we need extra effort to > >> - Simply asking the controller to compare the epoch with its best > >> knowledge is not enough, because the ZK controller may not know the > latest > >> broker epoch, > >> - The current design only helps with the delayed AlterPartition issue > >> when the broker reboots. In ZK mode, we also need to cover the broker > >> reboot + controller reboot scenario. If the reboot broker is in ISR > >> already, the controller also crashes during the broker reboot, the new > >> controller can be completely unaware of the bounced broker and select > this > >> broker as the leader. > >> - Create a test framework to simulate the event sequence of broker > reboot > >> and registration, delayed AlterPartition request. > >> > >> To Jose: > >> 1. Thanks for the renaming advice. I will update the KIP later. > >> 2. Ack, will update. > >> > >> > >> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio > >> <jsan...@confluent.io.invalid> wrote: > >> > >>> Hi Calvin, > >>> > >>> Thanks for the improvement. > >>> > >>> 1. In the KIP, you suggest changing the Fetch request to "Rename the > >>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch > >>> RPC is used by replicas that are not brokers, for example controllers > >>> in KRaft. > >>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft > >>> and ISR partitions have the concept of replica id and replica epoch > >>> but not necessarily the concept of a broker. > >>> > >>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also > >>> have a default value? How about -1 since that is what you use in > >>> AlterPartittion RPC. > >>> > >>> -- > >>> -José > >>> > >> >