[jira] [Assigned] (KAFKA-19963) Explain how to parallelize per topic with Kafka Streams

Nick Guo (Jira) Fri, 05 Dec 2025 22:03:58 -0800


     [ 
https://issues.apache.org/jira/browse/KAFKA-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Nick Guo reassigned KAFKA-19963:
--------------------------------

    Assignee: Nick Guo

> Explain how to parallelize per topic with Kafka Streams
> -------------------------------------------------------
>
>                 Key: KAFKA-19963
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19963
>             Project: Kafka
>          Issue Type: Improvement
>          Components: docs, streams
>            Reporter: Matthias J. Sax
>            Assignee: Nick Guo
>            Priority: Major
>
> We are regularly asked how to break a Kafka Streams program into more 
> tasks for better parallelization. The pattern is usually something like this:
> {code:java}
> KStream input = builder.stream(<list-of-topics--or--pattern>);
> KStream result = input.filter(...).map(...); // or any other logic
> result.to("output-topic");{code}
> The above program reads from multiple topics but creates a single 
> sub-topology, so the number of tasks we get equals the maximum number of 
> partitions across all input topics. However, there is no reason to funnel 
> the data of partition X of every topic through a single task X.
> To break up the program, one can rewrite the topology to create multiple 
> sub-topologies, allowing for independent tasks per topic:
> {code:java}
> List<String> topics = <list-of-topics>;
> for (String topic : topics) {
>     KStream input = builder.stream(topic);
>     KStream result = input.filter(...).map(...); // or any other logic
>     result.to("output-topic");
> }{code}
> The above program creates an independent sub-topology per input topic, each 
> getting its own set of tasks (one task per partition of that topic).
> We should add this information to the docs, as this question comes up 
> regularly.
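
To make the difference in task counts concrete, here is a minimal arithmetic sketch. It does not use the Kafka Streams API; the class name, method names, and partition counts (4, 6, and 2) are hypothetical and only model the rule described above: a single sub-topology gets as many tasks as the largest input topic has partitions, while one sub-topology per topic gets one task per partition of every topic.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Kafka Streams: it only models the
// task-count rule described in the issue.
class TaskCountSketch {

    // Single sub-topology reading all topics: task X handles partition X of
    // every topic, so the task count is the maximum partition count.
    static int tasksWithSingleSubTopology(Collection<Integer> partitionCounts) {
        int max = 0;
        for (int partitions : partitionCounts) {
            max = Math.max(max, partitions);
        }
        return max;
    }

    // One sub-topology per topic: each topic contributes one task per
    // partition, so the task count is the sum of the partition counts.
    static int tasksWithSubTopologyPerTopic(Collection<Integer> partitionCounts) {
        int sum = 0;
        for (int partitions : partitionCounts) {
            sum += partitions;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed partition counts for three input topics (illustrative only).
        List<Integer> partitionCounts = List.of(4, 6, 2);
        System.out.println("single sub-topology: "
                + tasksWithSingleSubTopology(partitionCounts));   // prints 6
        System.out.println("sub-topology per topic: "
                + tasksWithSubTopologyPerTopic(partitionCounts)); // prints 12
    }
}
```

With these assumed counts, the rewritten topology can spread its work across up to 12 tasks instead of 6, provided enough stream threads or application instances are available to run them.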



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
