Rohit Bobade created KAFKA-15520:
------------------------------------
Summary: Kafka Streams Stateful Aggregation Rebalancing causing
processing to pause on all partitions
Key: KAFKA-15520
URL: https://issues.apache.org/jira/browse/KAFKA-15520
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 2.6.2
Reporter: Rohit Bobade
Kafka broker version: 2.8.0
Kafka Streams client version: 2.6.2
I am running Kafka Streams stateful aggregations on a Kubernetes StatefulSet
with a persistent volume attached to each pod. I have also set
props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, podName);
so that each pod is a static group member and keeps a sticky partition
assignment.
I enabled one standby replica with
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
and set
props.put(StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG, "0");
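A minimal sketch of the three settings described so far, gathered in one place. To keep it runnable without the Kafka jars, it uses the plain string keys that the ConsumerConfig and StreamsConfig constants resolve to. The class name StickyStandbyProps and the example pod name "pod-0" are illustrative assumptions, not taken from the actual application.

```java
import java.util.Properties;

public class StickyStandbyProps {

    // Builds only the static-membership and standby settings from the report.
    public static Properties build(String podName) {
        Properties props = new Properties();
        // ConsumerConfig.GROUP_INSTANCE_ID_CONFIG: static membership, one id per pod
        props.put("group.instance.id", podName);
        // StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG: one standby copy of each state store
        props.put("num.standby.replicas", 1);
        // StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG: value as given in the report
        props.put("acceptable.recovery.lag", "0");
        return props;
    }

    public static void main(String[] args) {
        // "pod-0" is a placeholder for the StatefulSet pod name.
        Properties props = build("pod-0");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In the real application these entries would sit alongside the usual application.id and bootstrap.servers settings, which are omitted here.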
However, when pods restart, a rebalance is triggered and processing pauses on
all pods until the rebalance and state restoration have finished.
My understanding is that, even when a rebalance occurs, only the partitions
that actually move are restored, cooperatively, without pausing the rest of the
processing. I also expected the tasks to fail over to the standby replica in
this case, avoiding state restoration on other pods.
I have increased the session timeout to 480 seconds and the max poll interval
to 15 minutes to minimize rebalances.
I also added
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
CooperativeStickyAssignor.class.getName());
to enable the CooperativeStickyAssignor.
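The timeout and assignor settings above can be sketched as plain properties. Both timeouts are configured in milliseconds, so 480 seconds becomes 480000 and 15 minutes becomes 900000. The class name RebalanceTuningProps is a hypothetical helper, and string keys stand in for the ConsumerConfig constants so the sketch runs without the Kafka jars.

```java
import java.time.Duration;
import java.util.Properties;

public class RebalanceTuningProps {

    // Builds only the rebalance-related settings mentioned in the report.
    public static Properties build() {
        Properties props = new Properties();
        // ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG: 480 seconds, in milliseconds
        props.put("session.timeout.ms", (int) Duration.ofSeconds(480).toMillis());
        // ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG: 15 minutes, in milliseconds
        props.put("max.poll.interval.ms", (int) Duration.ofMinutes(15).toMillis());
        // ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG: fully qualified class
        // name, i.e. what CooperativeStickyAssignor.class.getName() returns
        props.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```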
Could someone please tell me whether I am missing something?