Hi Pushkar,

In 2.5, Kafka Streams used an assignor that tried to strike a compromise
between stickiness and workload balance, so you would observe some
stickiness, but it was not guaranteed.

In 2.6, we introduced the "high availability task assignor" (see KIP-441 
https://cwiki-test.apache.org/confluence/display/KAFKA/KIP-441%3A+Smooth+Scaling+Out+for+Kafka+Streams).
This assignor is guaranteed to always assign tasks to the instance that is
most caught up (typically, this would be the instance that was already the
active processor, which is equivalent to stickiness). If an instance is lost
(e.g., the pod gets replaced), any standby replica would be considered
"most caught up" and would take over processing with very little downtime.

The new assignor achieves balance over time by "warming up" tasks in the
background on other instances and then swapping the assignment over to them
once they are caught up.

So, if you upgrade Streams, you should be able to configure at least one
standby task and then implement the "rolling replacement" strategy you
described. If you are willing to wait for Streams to gradually rebalance the
assignment after each replacement, then you can cycle out the whole cluster
without ever incurring downtime or developing workload skew. Note that there
are several configuration parameters you can adjust to speed up the warm-up
process:
https://cwiki-test.apache.org/confluence/display/KAFKA/KIP-441%3A+Smooth+Scaling+Out+for+Kafka+Streams#KIP441:SmoothScalingOutforKafkaStreams-Parameters.
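
Continuing the sketch above, the KIP-441 knobs look like this in code (the
values here are just illustrations, not recommendations; check the defaults
and semantics in the docs for your Streams version):

    // Allow more tasks to warm up at once (default 2):
    props.put(StreamsConfig.MAX_WARMUP_REPLICAS_CONFIG, 4);
    // Probe more often for warmed-up tasks to promote (default 10 minutes):
    props.put(StreamsConfig.PROBING_REBALANCE_INTERVAL_MS_CONFIG, 60_000L);
    // Max lag at which a standby still counts as "caught up" (default 10000 records):
    props.put(StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG, 10_000L);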

I hope this helps!
-John

On 2023/04/14 17:41:19 Pushkar Deole wrote:
> Any inputs on the query below?
> 
> On Wed, Apr 12, 2023 at 2:22 PM Pushkar Deole <pdeole2...@gmail.com> wrote:
> 
> > Hi All,
> >
> > We are using version 2.5.1 of kafka-streams with 3 application instances
> > deployed as 3 kubernetes pods.
> > It consumes from multiple topics, each with 6 partitions.
> > I would like to know if streams uses a sticky partition assignor strategy
> > internally, since we can't set it externally on streams.
> >
> > My scenario is like this, during rolling upgrades:
> >
> > Step 1: 1 new pod comes up so there are 4 pods, with some partitions
> > assigned to the newly created pod, and k8s then deletes one of the older
> > pods, so it is pod1, pod2, pod3 (older) and pod4 (newer). Then pod1 is
> > deleted. So ultimately pod2, pod3, pod4
> >
> > Step 2: K8s then repeats the same for another old pod, i.e. creates a new
> > pod and then deletes an old pod. So pod2, pod3, pod4, pod5, and then pod2
> > is deleted. So ultimately pod3, pod4 and pod5
> >
> > The question I have here is: will kafka streams try to stay sticky with
> > the partitions assigned to newly created pods across all these rebalances,
> > i.e. will the partitions assigned to pod4 in step 1 still be retained
> > during step 2 when another older pod gets deleted, OR will the partitions
> > be reshuffled on each rebalance whenever older pods get deleted? So during
> > step 2, when pod2 is deleted, will the partitions assigned to pod4 in step
> > 1 also reshuffle again, or will they stay put, with only new partitions
> > being assigned?
> >
> >
> 
