Thanks a lot, folks, this is really helpful.

> I believe the limitation that this documentation is hinting at is the
> motivation for KIP-996

I'll make sure to check out KIP-996 and the references linked there.
Thanks for the summary as well, I really appreciate it.
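
For posterity, here is roughly the 3-controller setup we're planning to
try, per Greg's recommendation. The host names, node ids, ports, and
directories below are just placeholders for our environment, not anything
taken from the docs or this thread:

    # controller.properties on each of three dedicated controller nodes
    # (node.id is 2 and 3 on the other two nodes)
    process.roles=controller
    node.id=1
    controller.quorum.voters=1@ctrl-1:9093,2@ctrl-2:9093,3@ctrl-3:9093
    listeners=CONTROLLER://:9093
    controller.listener.names=CONTROLLER
    log.dirs=/var/lib/kafka/kraft-metadata

If I'm understanding correctly, keeping the voter set at three entries in
controller.quorum.voters is the whole recommendation until KIP-996 lands.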


Cheers,

Dani

On Tue, Feb 6, 2024 at 6:52 PM Michael K. Edwards <m.k.edwa...@gmail.com> wrote:
>
> A 5-node quorum doesn't make a lot of sense in a setting where those nodes
> are also Kafka brokers.  When they're ZooKeeper voters, a quorum* of 5
> makes a lot of sense, because you can take an unscheduled voter failure
> during a rolling-reboot scheduled maintenance without significant service
> impact.  You can also spread the ZK quorum across multiple AZs (or your
> cloud's equivalent), which I would rarely recommend doing with Kafka.
>
> The trend in Kafka development and deployment is towards KRaft, and there
> is probably no percentage in bucking that trend.  Just don't expect it to
> cover every "worst realistic case" scenario that a ZK-based deployment can.
>
> Scheduled maintenance on an (N+2 for read integrity, N+1 to stay writable)
> system adds vulnerability, and that's just something you have to build into
> your risk model.  N+1 is good enough for finely partitioned data in any use
> case that Kafka fits, because resilvering after a maintenance or a full
> broker loss is highly parallel.  N+1 is also acceptable for consumer group
> coordinator metadata, as long as you tune for aggressive compaction; I
> haven't looked at whether the coordinator code does a good job of
> parallelizing metadata replay, but if it doesn't, there's no real
> difficulty in fixing that.  For global metadata that needs globally
> serialized replay, which is what the controller metadata is, I was a lot
> happier with N+2 to stay writable.  But that's water under the bridge, and
> I'm just a spectator.
>
> Regards,
> - Michael
>
>
> * I hate this misuse of the word "quorum", but what can one do?
>
>
> On Tue, Feb 6, 2024, 8:51 AM Greg Harris <greg.har...@aiven.io.invalid>
> wrote:
>
> > Hi Dani,
> >
> > I believe the limitation that this documentation is hinting at is the
> > motivation for KIP-996 [1], and the notice in the documentation would
> > be removed once KIP-996 lands.
> > You can read the KIP for a brief explanation and link to a more
> > in-depth explanation of the failure scenario.
> >
> > While a 3-node quorum would typically be less reliable or available
> > than a 5-node quorum, it happens to be resistant to this failure mode,
> > which makes the additional controllers liabilities instead of assets.
> > In the judgement of the maintainers at least, the risk of a network
> > partition which could trigger unavailability in a 5-node quorum is
> > higher than the risk of a 2-controller failure in a 3-node quorum, so
> > 3-node quorums are recommended.
> > You could do your own analysis and practical testing to make this
> > tradeoff yourself in your network context.
> >
> > I hope this helps!
> > Greg
> >
> > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-996%3A+Pre-Vote
> >
> > On Tue, Feb 6, 2024 at 4:25 AM Daniel Saiz
> > <daniel.s...@shopify.com.invalid> wrote:
> > >
> > > Hello,
> > >
> > > I would like to clarify a statement I found in the KRaft documentation,
> > > in the deployment section [1]:
> > >
> > > > More than 3 controllers is not recommended in critical environments. In
> > > > the rare case of a partial network failure it is possible for the cluster
> > > > metadata quorum to become unavailable. This limitation will be addressed
> > > > in a future release of Kafka.
> > >
> > > I would like to clarify what is meant by that sentence, as intuitively
> > > I don't see why 3 replicas would be better than 5 (or more) for fault
> > > tolerance.
> > > What is the current limitation this is referring to?
> > >
> > > Thanks a lot.
> > >
> > >
> > > Cheers,
> > >
> > > Dani
> > >
> > > [1] https://kafka.apache.org/36/documentation.html#kraft_deployment
> >
