[ https://issues.apache.org/jira/browse/KAFKA-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax resolved KAFKA-17380.
-------------------------------------
    Resolution: Fixed

Resolving this as "fixed" based on the last reply, and given that the affects version is 2.x...

> Kafka Streams few partition stuck in processing - fixed after restart
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-17380
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17380
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.2
>            Reporter: Rohit Bobade
>            Priority: Major
>
> Using Kafka Streams 2.6.2 and running stateful aggregations with exactly-once
> semantics.
> The processing logic is:
> consume input records -> intermediate aggregate and buffer data in a state
> store backed by a changelog topic -> punctuate every 15 seconds - flush the state
> store and send aggregated records downstream -> final aggregate operation and
> send to output topic
> Since we use spot instances, one of the pods got restarted, a rebalance was
> triggered, and state was being restored from the changelog topic.
> We noticed ProducerFenced exceptions:
> {quote}org.apache.kafka.common.errors.ProducerFencedException: Producer
> attempted an
> operation with an old epoch. Either there is a newer producer with the same
> transactionalId, or the producer's transaction has been expired by the broker.
> {quote}
> After this, a few partitions were stuck and no records were processed until we
> restarted the application.
> We had configured:
>
> transaction.timeout.ms to 30 seconds
> session.timeout.ms to 30 seconds
> Could you please advise if there's any known fix for this edge case?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
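
For reference, the settings named in the report could be written as Kafka Streams properties roughly as follows. This is only a sketch under assumptions: the report does not show the actual configuration code, the class and method names here are hypothetical, and "exactly_once" is assumed as the processing.guarantee value (the report only says exactly-once semantics were enabled). The "producer." and "consumer." prefixes are the standard Kafka Streams way of passing a setting to the embedded clients. Plain string keys are used so the snippet compiles without the Kafka libraries.

```java
import java.util.Properties;

class ReportedStreamsSettings {

    // Collects the settings described in the report as plain string properties.
    static Properties reportedSettings() {
        Properties props = new Properties();
        // Assumption: the report says exactly-once semantics were used but
        // does not name the exact processing.guarantee value.
        props.setProperty("processing.guarantee", "exactly_once");
        // transaction.timeout.ms is a producer setting: 30 seconds, as reported.
        props.setProperty("producer.transaction.timeout.ms", "30000");
        // session.timeout.ms is a consumer setting: 30 seconds, as reported.
        props.setProperty("consumer.session.timeout.ms", "30000");
        return props;
    }

    public static void main(String[] args) {
        // Print each configured key/value pair.
        reportedSettings().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real application these properties would be passed to the KafkaStreams constructor together with application.id and bootstrap.servers, which the report does not list.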