[ https://issues.apache.org/jira/browse/KAFKA-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhanxiang (Patrick) Huang reassigned KAFKA-8202:
------------------------------------------------

    Assignee: Zhanxiang (Patrick) Huang

> StackOverflowError on producer when splitting batches
> -----------------------------------------------------
>
>                 Key: KAFKA-8202
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8202
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Daniel Krawczyk
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> Hello,
> recently we came across a StackOverflowError in the Kafka producer Java library. The error caused the Kafka producer to stop; we had to restart our service because subsequent operations failed with "IllegalStateException: Cannot perform operation after producer has been closed".
> The stack trace was as follows:
> {code:java}
> java.lang.StackOverflowError: null
> 	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.chain(FutureRecordMetadata.java:89)
> 	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.chain(FutureRecordMetadata.java:89)
> 	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.chain(FutureRecordMetadata.java:89)
> 	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.chain(FutureRecordMetadata.java:89)
> 	// […] (the same frame repeated)
> {code}
> The piece of code responsible for the error:
> {code:java}
> /**
>  * This method is used when we have to split a large batch in smaller ones. A chained metadata will allow the
>  * future that has already returned to the users to wait on the newly created split batches even after the
>  * old big batch has been deemed as done.
>  */
> void chain(FutureRecordMetadata futureRecordMetadata) {
>     if (nextRecordMetadata == null)
>         nextRecordMetadata = futureRecordMetadata;
>     else
>         nextRecordMetadata.chain(futureRecordMetadata);
> }
> {code}
> Before the error occurred we observed a large number of log entries related to record batches being split (caused by the MESSAGE_TOO_LARGE error) on one of our topics, logged by org.apache.kafka.clients.producer.internals.Sender:
> {code:java}
> [Producer clientId=producer-1] Got error produce response in correlation id 158621342 on topic-partition <topic name>, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE
> {code}
> All logs had different correlation ids but the same count of attempts left (2147483647), so it looked like they were related to different requests, and all of them were succeeding with no further retries.
> We are using the kafka-clients Java library in version 2.0.0; the brokers are on 2.1.1.
> Thanks in advance.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
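To illustrate the failure mode: each recursive `chain()` call adds one stack frame per element already in the chain, so a long run of batch splits can exhaust the thread stack. The sketch below uses a hypothetical, simplified stand-in class (not the real `FutureRecordMetadata`, and not necessarily the actual Kafka fix) to contrast the recursive append from the report with an iterative append that uses constant stack depth:

```java
// Hypothetical simplified stand-in for FutureRecordMetadata, showing why
// the recursive chain() overflows and how an iterative walk avoids it.
final class ChainedMetadata {
    private ChainedMetadata nextRecordMetadata;

    // Recursive append, as quoted in the report: stack depth grows
    // linearly with the current chain length, so enough splits in a
    // row eventually throw StackOverflowError.
    void chainRecursive(ChainedMetadata next) {
        if (nextRecordMetadata == null)
            nextRecordMetadata = next;
        else
            nextRecordMetadata.chainRecursive(next);
    }

    // Iterative append: walks to the tail in a loop, so stack depth
    // stays constant no matter how long the chain gets.
    void chainIterative(ChainedMetadata next) {
        ChainedMetadata current = this;
        while (current.nextRecordMetadata != null)
            current = current.nextRecordMetadata;
        current.nextRecordMetadata = next;
    }

    // Number of elements chained after this node (for demonstration).
    int chainLength() {
        int n = 0;
        for (ChainedMetadata c = nextRecordMetadata; c != null; c = c.nextRecordMetadata)
            n++;
        return n;
    }

    public static void main(String[] args) {
        ChainedMetadata head = new ChainedMetadata();
        // 10,000 appends complete without deep recursion; the recursive
        // variant at this scale risks overflowing a small thread stack.
        for (int i = 0; i < 10_000; i++)
            head.chainIterative(new ChainedMetadata());
        System.out.println(head.chainLength()); // prints 10000
    }
}
```

Note that the iterative walk is O(n) per append; a production fix would more likely keep a tail reference, but the point here is only the bounded stack depth.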