Looks like the core of the problem should still be the juggling game of 
consistency, availability and partition tolerance.  If we want the cluster 
still work when brokers have inconsistent information due to network partition, 
we have to choose between consistency and availability.
My proposal is not about fix the message loss. Will share when ready.
Thanks Andrew.
________________________________
From: Andrew Grant <andrewgrant...@gmail.com>
Sent: 21 November 2023 12:35
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: How Kafka handle partition leader change?

Hey De Gao,

Message loss or duplication can actually happen even without a leadership 
change for a partition. For example if there are network issues and the 
producer never gets the ack from the server, it’ll retry and cause duplicates. 
Message loss can usually occur when you use acks=1 config - mostly you’d lose 
after a leadership change but in theory if the leader was restarted, the page 
cache was lost and it stayed leader again we could lose the message if it 
wasn’t replicated soon enough.

You might be right it’s more likely to occur during leadership change though - 
not 100% sure myself on that.

Point being, the idempotent producer really is the way to write once and only 
once as far as I’m aware.

If you have any suggestions for improvements I’m sure the community would love 
to hear them! It’s possible there are ways to make leadership changes more 
seamless and at least reduce the probability of duplicates or loss. Not sure 
myself. I’ve wondered before if the older leader could reroute messages for a 
small period of time until the client knew the new leader for example.

Andrew

Sent from my iPhone

> On Nov 21, 2023, at 1:42 AM, De Gao <d...@live.co.uk> wrote:
>
> I am asking this because I want to propose a change to Kafka. But looks like 
> in certain scenario it is very hard to not loss or duplication messages. 
> Wonder in what scenario we can accept that and where to draw the line?
>
> ________________________________
> From: De Gao <d...@live.co.uk>
> Sent: 21 November 2023 6:25
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: Re: How Kafka handle partition leader change?
>
> Thanks Andrew.  Sounds like the leadership change from Kafka side is a 'best 
> effort' to avoid message duplicate or loss. Can we say that message lost is 
> very likely during leadership change unless producer uses idempotency? Is 
> this a generic situation that no intent to provide data integration guarantee 
> upon metadata change?
> ________________________________
> From: Andrew Grant <agr...@confluent.io.INVALID>
> Sent: 20 November 2023 12:26
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: Re: How Kafka handle partition leader change?
>
> Hey De Gao,
>
> The controller is the one that always elects a new leader. When that happens 
> that metadata is changed on the controller and once committed it’s broadcast 
> to all brokers in the cluster. In KRaft this would be via a PartitonChange 
> record that each broker will fetch from the controller. In ZK it’d be via an 
> RPC from the controller to the broker.
>
> In either case each broker might get the notification at a different time. No 
> ordering guarantee among the brokers. But eventually they’ll all know the new 
> leader which means eventually the Produce will fail with NotLeader and the 
> client will refresh its metadata and find out the new one.
>
> In between all that leadership movement, there are various ways messages can 
> get duplicated or lost. However if you use the idempotent producer I believe 
> you actually won’t see dupes or missing messages so if that’s an important 
> requirement you could look into that. The producer is designed to retry in 
> general and when you use the idempotent producer some extra metadata is sent 
> around to dedupe any messages server-side that were sent multiple times by 
> the client.
>
> If you’re interested in learning more Kafka internals I highly recommend this 
> blog series 
> https://www.google.com/url?q=https://www.confluent.io/blog/apache-kafka-architecture-and-internals-by-jun-rao/&source=gmail-imap&ust=1701153737000000&usg=AOvVaw1Bnr9YgbxvIt1NJmgdFzn5
>
> Hope that helped a bit.
>
> Andy
>
> Sent from my iPhone
>
>> On Nov 20, 2023, at 2:07 AM, De Gao <d...@live.co.uk> wrote:
>>
>> Hi all I have a interesting question here.
>>
>> Let's say we have 2 broker B1 B2, controller C and producer P1, P2...Pn. 
>> Currently B1 holds the partition leader and Px is constantly producing 
>> messages to B1. We want to move the partition leadership to B2. How does the 
>> leadership change synced between B1, B2, C, and Px that it is guaranteed 
>> that all the parties acknowledged the leadership change in the right order? 
>> Was there a break of produce flow in between? Any chance of  message lost?
>>
>> Thanks
>>
>> De Gao

Reply via email to