[jira] [Resolved] (KAFKA-6037) Make Sub-topology Parallelism Tunable

Matthias J. Sax (Jira) Thu, 09 Apr 2020 16:56:45 -0700


     [ 
https://issues.apache.org/jira/browse/KAFKA-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Matthias J. Sax resolved KAFKA-6037.
------------------------------------
    Resolution: Fixed

Closing is as "fixed by KAFKA-8611".

> Make Sub-topology Parallelism Tunable
> -------------------------------------
>
>                 Key: KAFKA-6037
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6037
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Levani Kokhreidze
>            Priority: Major
>              Labels: kip
>             Fix For: 2.6.0
>
>
> Today the downstream sub-topology's parallelism (aka the number of tasks) are 
> purely dependent on the upstream sub-topology's parallelism, which ultimately 
> depends on the source topic's num. partitions. However this does not work 
> perfectly with dynamic scaling scenarios.
> Imagine if your have a simple aggregation application, it would have two 
> sub-topologies cut by the repartition topic, the first sub-topology would be 
> very light as it reads from input topics and write to repartition topic based 
> on the agg-key; the second sub-topology would do the actual work with the agg 
> state store, etc, hence is heavy computational. Right now the first and 
> second topology will always have the same number of tasks as the repartition 
> topic num.partitions is defined to be the same as the source topic 
> num.partitions, so to scale up we have to increase the number of input topic 
> partitions.
> One way to improve on that, is to use a default large number for repartition 
> topics and also allow users to override it (either through DSL code, or 
> through config). Doing this different sub-topologies would have different 
> number of tasks, i.e. parallelism units. In addition, users may also use this 
> config to "hint" the DSL translator to NOT create the repartition topics 
> (i.e. to not break up the sub-topologies) when she has the knowledge of the 
> data format.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (KAFKA-6037) Make Sub-topology Parallelism Tunable

Reply via email to