[ 
https://issues.apache.org/jira/browse/KAFKA-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567441#comment-16567441
 ] 

ASF GitHub Bot commented on KAFKA-6761:
---------------------------------------

bbejeck opened a new pull request #5451: KAFKA-6761: Reduce streams footprint 
part IV add optimization
URL: https://github.com/apache/kafka/pull/5451
 
 
   This PR adds an optimization that eliminates redundant repartition topics: 
when the `KStream` resulting from a key-changing operation is used by several 
downstream operations that depend on the new key, the repartition topics those 
operations would each create are reduced to one.
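
   The effect described above can be illustrated with a small toy model. This 
is only a sketch in plain Java, not the code in this PR and not the Kafka 
Streams API: `KeyChangingNode`, `planUnoptimized`, `planOptimized`, the 
operation names, and the topic names are all hypothetical and exist only to 
show the counting.

```java
import java.util.ArrayList;
import java.util.List;

public class RepartitionSketch {

    /** Hypothetical stand-in for the stream produced by a key-changing operation. */
    static final class KeyChangingNode {
        final String name;
        // Downstream operations that need records partitioned by the new key.
        final List<String> keyDependentOps = new ArrayList<>();

        KeyChangingNode(String name) {
            this.name = name;
        }
    }

    /** Un-optimized plan: every key-dependent operation gets its own repartition topic. */
    static List<String> planUnoptimized(KeyChangingNode node) {
        List<String> topics = new ArrayList<>();
        for (String op : node.keyDependentOps) {
            topics.add(op + "-repartition"); // hypothetical naming
        }
        return topics;
    }

    /** Optimized plan: one repartition topic after the key change, shared by all operations. */
    static List<String> planOptimized(KeyChangingNode node) {
        List<String> topics = new ArrayList<>();
        if (!node.keyDependentOps.isEmpty()) {
            topics.add(node.name + "-repartition"); // hypothetical naming
        }
        return topics;
    }

    public static void main(String[] args) {
        KeyChangingNode rekeyed = new KeyChangingNode("rekeyed");
        // Four hypothetical operations that all depend on the new key.
        rekeyed.keyDependentOps.add("count");
        rekeyed.keyDependentOps.add("reduce");
        rekeyed.keyDependentOps.add("aggregate");
        rekeyed.keyDependentOps.add("join");

        System.out.println("un-optimized: " + planUnoptimized(rekeyed));
        System.out.println("optimized:    " + planOptimized(rekeyed));
    }
}
```

   With the four operations in `main`, the un-optimized plan contains four 
topics and the optimized plan contains one.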
   
   Note that this PR leaves in place the optimization for re-using a source 
topic as a changelog topic for source `KTable` instances.  I'll have another 
follow-up PR to move the source topic optimization to a method within 
`InternalStreamsBuilder` so it can be performed in the same area of the code.
   
   Additionally, the current value of `StreamsConfig.OPTIMIZE` is `all` and 
we'll need to have another KIP to change the value to `2.1`.   
   
   This PR also adds an integration test, `RepartitionOptimizingIntegrationTest`, 
which asserts that an optimized topology with one repartition topic produces 
the same results as the un-optimized version with four repartition topics.
   More tests will be added, but I wanted to get reviews on the approach now.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Reduce Kafka Streams Footprint
> ------------------------------
>
>                 Key: KAFKA-6761
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6761
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Bill Bejeck
>            Assignee: Bill Bejeck
>            Priority: Major
>             Fix For: 2.1.0
>
>
> The persistent storage footprint of a Kafka Streams application consists of 
> the following:
>  # The internal topics created on the Kafka cluster side.
>  # The materialized state stores on the Kafka Streams application instances 
> side.
> There have been some questions about reducing these footprints, especially 
> since many of them are not necessary. For example, there are redundant 
> internal topics, as well as unnecessary state stores that take up space and 
> also affect performance. When people push Streams to production with high 
> traffic, this issue becomes more common and more severe. Reducing the 
> footprint of Streams has clear benefits: it lowers the resource utilization 
> of Kafka Streams applications and avoids putting pressure on the brokers' 
> capacity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
