Hi Paolo,

Thank you for your response! I tested out a different theory today where I
deployed the kraft controller statefulset and waited to see which brokers
would be elected as controllers.

Here is an example of my migration right after I have provisioned the kraft
controller brokers/statefulset. At this point, the brokers haven't been
restarted.

get /controller
{"version":2,"brokerid":1,"timestamp":"1710876891432","kraftControllerEpoch":-1}

get /migration
{"version":0,"kraft_metadata_offset":-1,"kraft_controller_id":-1,"kraft_metadata_epoch":-1,"kraft_controller_epoch":-1}
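
A minimal sketch of how these two payloads can be read, assuming the interpretation discussed in this thread: a kraftControllerEpoch of -1 means a ZooKeeper-based broker holds /controller, and a kraft_controller_id of -1 means no KRaft controller has been recorded in /migration. The helper name and the keys of the returned dictionary are hypothetical and not part of any Kafka tooling.

```python
import json

# Payloads copied verbatim from the ZooKeeper output above.
CONTROLLER_ZNODE = '{"version":2,"brokerid":1,"timestamp":"1710876891432","kraftControllerEpoch":-1}'
MIGRATION_ZNODE = '{"version":0,"kraft_metadata_offset":-1,"kraft_controller_id":-1,"kraft_metadata_epoch":-1,"kraft_controller_epoch":-1}'


def summarize_znodes(controller_json, migration_json):
    """Hypothetical helper: summarize who holds /controller and whether
    /migration records a KRaft controller."""
    controller = json.loads(controller_json)
    migration = json.loads(migration_json)
    # Assumption: an epoch of -1 means a ZooKeeper-based broker holds the znode.
    kraft_epoch = controller.get("kraftControllerEpoch", -1)
    return {
        "controller_id": controller["brokerid"],
        "held_by_kraft": kraft_epoch > -1,
        "kraft_controller_epoch": kraft_epoch,
        "migration_has_kraft_controller": migration.get("kraft_controller_id", -1) != -1,
    }


print(summarize_znodes(CONTROLLER_ZNODE, MIGRATION_ZNODE))
```

Running the same helper before and after rolling the brokers gives a quick way to see whether either epoch has moved.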

At this point, my dashboard shows that a kafka broker is a controller and
that a kraft controller broker is also a controller (although that's not
what I see in zookeeper, as shown above). One thing to note is that I am
doing this migration on a stretched cluster, so this may alter the way that
the quorum is set up (I have three kraft controller brokers across three
regions).

After I roll the brokers, I find that this is still the case (the kraft
controller epoch has not increased in either znode). If you don't mind
sharing, what were the steps that you followed to migrate to KRaft?

Thank you,
Sanaa

On Tue, Mar 19, 2024 at 7:05 AM Paolo Patierno <paolo.patie...@gmail.com>
wrote:

> Hi Sanaa,
> from my experience running migrations, this has never happened to me, and
> it should not happen anyway.
>
> When a (ZooKeeper-based) broker registers to be the controller at the
> beginning, you can see that the corresponding /controller znode will have
> -1 as its epoch.
> Something like:
>
> {"version":2,"brokerid":0,"timestamp":"1710845218527","kraftControllerEpoch":-1}
> When you deploy the KRaft controller quorum and roll the brokers to
> register and start the migration, the controller role is taken by one of
> the KRaft controllers, and its epoch will certainly be greater than -1.
> Something like:
> {"version":2,"brokerid":4,"timestamp":"1710844690234","kraftControllerEpoch":10}
> A KRaft controller is able to "steal" the controller role precisely because
> its epoch will certainly be greater than -1.
> So during or after a migration, a broker cannot get the controller role,
> because only a node with an epoch greater than the current one can do that
> (and for a ZooKeeper-based broker it will always be -1).
> A broker can become the controller again only when you roll back the
> migration: you delete the controllers, and you have to delete the
> /controller znode (as the procedure describes). Only in this case is a
> broker able to "win" the /controller znode by using -1 as its epoch
> (because the /controller znode doesn't exist anymore).
> I'm not sure whether, in your case, something went wrong during the
> migration or while rolling the brokers.
>
> Thanks,
> Paolo.
>
>
> On Mon, 18 Mar 2024 at 21:22, Sanaa Syed <sanaa.s...@shopify.com.invalid>
> wrote:
>
> > Hello,
> >
> > I've begun migrating some of my Zookeeper Kafka clusters to KRaft. A
> > behaviour I've noticed twice, across two different kafka cluster
> > environments, is that after provisioning a kraft controller quorum in
> > migration mode, it is possible for a kafka broker to become an active
> > controller alongside a kraft controller broker.
> >
> > For example, here are the steps I follow and the behaviour I notice (I'm
> > currently using Kafka v3.6):
> > 1. Enable the KRaft migration on the existing Kafka brokers (set the
> > `controller.quorum.voters`, `controller.listener.names` and
> > `zookeeper.metadata.migration.enable` configs in the server.properties
> > file).
> > 2. Deploy a kraft controller statefulset and service with the migration
> > enabled so that data is copied over from Zookeeper and we enter a
> > dual-write mode.
> > 3. After a few minutes, I see that the migration has completed (it's a
> > pretty small cluster). At this point, the kraft controller pod has been
> > elected to be the controller (and I see this in zookeeper when I run
> > `get /controller`). If the kafka brokers or kraft controller pods are
> > restarted at any point after the migration is completed, a kafka broker
> > is elected to be the controller and is reflected in zookeeper as well.
> > Now, I have two active controllers - 1 is a kafka broker and 1 is a
> > kraft controller broker.
> >
> > A couple questions I have:
> > 1. Is this the expected behaviour? If so, how long after a migration has
> > been completed should we hold off on restarting kafka brokers to avoid
> > this situation?
> > 2. Why is it possible for a kafka broker to be a controller again
> > post-migration?
> > 3. How do we come back to a state where a kraft controller broker is the
> > only controller again in the least disruptive way possible?
> >
> > Thank you,
> > Sanaa
> >
>
>
> --
> Paolo Patierno
>
> Senior Principal Software Engineer @ Red Hat
> Microsoft MVP on Azure
>
> Twitter : @ppatierno <http://twitter.com/ppatierno>
> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
> GitHub : ppatierno <https://github.com/ppatierno>
>


-- 
Sanaa Syed
