[ https://issues.apache.org/jira/browse/KAFKA-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813573#comment-16813573 ]
Bill Bejeck commented on KAFKA-8201:
------------------------------------

Hi Anders Aagaard,

I'm not sure this is a bug in Kafka Streams, since this is known behavior (as you referenced in the above Jira) and is controlled by the broker. However, there is a workaround. Kafka Streams users have control over the settings for repartition (internal) topics. When setting up your application, you can adjust any of the settings you've listed above by using the StreamsConfig.topicPrefix method along with the relevant topic configuration key. For example:

{noformat}
final Properties props = new Properties();
props.put(StreamsConfig.topicPrefix(TopicConfig.CLEANUP_POLICY_CONFIG), TopicConfig.CLEANUP_POLICY_COMPACT);
props.put(StreamsConfig.topicPrefix(TopicConfig.RETENTION_MS_CONFIG), "XXXXXX");
props.put(StreamsConfig.topicPrefix(TopicConfig.SEGMENT_MS_CONFIG), "XXXXXXX");
{noformat}

Just note that any settings applied this way will apply to all Kafka Streams internal topics.

HTH,
Bill

> Kafka streams repartitioning topic settings crashing multiple nodes
> -------------------------------------------------------------------
>
>                 Key: KAFKA-8201
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8201
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.0.0
>            Reporter: Anders Aagaard
>            Priority: Major
>
> We had an incident in a setup using Kafka Streams version 2.0.0 and Kafka
> version 2.0.0, protocol version 2.0-IV1. The cause was a combination of
> Kafka Streams defaults and a bug in Kafka.
> Info about the setup: a Streams application reading a log-compacted input
> topic and performing a groupBy operation requiring repartitioning.
> Kafka Streams automatically creates a repartitioning topic with 24 partitions
> and the following options:
> segment.bytes=52428800, retention.ms=9223372036854775807,
> segment.index.bytes=52428800, cleanup.policy=delete, segment.ms=600000.
>
> This should mean a new segment is rolled when the active one reaches 50 MB
> or is older than 10 minutes. However, the varying timestamps coming into
> the topic due to log compaction (sometimes differing by multiple days) mean
> the server will see a message which is older than segment.ms and
> automatically trigger a new segment roll. This causes a segment explosion,
> where new segments are continuously rolled out.
> There seems to be a bug report for this server side here:
> https://issues.apache.org/jira/browse/KAFKA-4336.
> This effectively took down several nodes and a broker in our cluster.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
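The segment-explosion mechanism described in the issue can be illustrated with a toy model. This is a hypothetical simplification for intuition only, not the broker's actual roll logic (which lives in the broker's log layer and is more involved): it treats a segment as rolled whenever a record's timestamp differs from the timestamp that opened the current segment by more than segment.ms.

```java
// Toy model (illustrative only): count how many segments a stream of record
// timestamps would produce if a roll happens whenever a record's timestamp
// is more than segment.ms away from the current segment's base timestamp.
public class SegmentRollSketch {

    static int countSegments(long[] timestamps, long segmentMs) {
        if (timestamps.length == 0) return 0;
        int segments = 1;
        long baseTs = timestamps[0]; // timestamp that opened the current segment
        for (int i = 1; i < timestamps.length; i++) {
            if (Math.abs(timestamps[i] - baseTs) > segmentMs) {
                segments++;          // roll: open a new segment
                baseTs = timestamps[i];
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        long segmentMs = 600_000L;   // 10 minutes, as on the repartition topic
        long day = 86_400_000L;

        // Well-ordered timestamps, all within 10 minutes: one segment.
        long[] ordered = {0L, 100_000L, 200_000L, 300_000L};

        // Compacted input replayed with timestamps days apart: every record rolls.
        long[] shuffled = {0L, day, 0L, 2 * day};

        System.out.println("ordered  -> " + countSegments(ordered, segmentMs) + " segment(s)");
        System.out.println("shuffled -> " + countSegments(shuffled, segmentMs) + " segment(s)");
    }
}
```

With well-ordered timestamps the model produces a single segment; with timestamps that jump back and forth by days, every record triggers a roll, which is the explosion the reporter observed.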