Hi Rapeepat,

1. The parallelism of Kafka Streams does not depend only on the number
of partitions of the input topic. It also depends on the structure of
your topology. Your example topology topicA => transform1 => topicB
=> transform2 => topicC would be subdivided into two subtopologies:
- subtopology 1: topicA => transform1 => topicB
- subtopology 2: topicB => transform2 => topicC
For each combination of a subtopology and a partition, a task is
created. Tasks are distributed across Kafka Streams instances, or more
precisely across the stream threads on those instances (see also
https://kafka.apache.org/25/documentation/streams/architecture).

If we assume one partition per topic, your topology would result in
two tasks. Assuming that you have one stream thread per instance, your
topology should be able to run on two Kafka Streams instances, with
one task on each.
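
To make the counting concrete, here is a small plain-Java sketch. It is
not Kafka Streams code: the class TaskSketch and the methods tasks and
assign are names made up for illustration, and the round-robin spread
is an assumption that simplifies what the real assignor does. The IDs
follow the <subtopology>_<partition> form that Kafka Streams uses for
task IDs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of this is Kafka Streams API.
class TaskSketch {

    // One task per (subtopology, partition) pair, named "<subtopology>_<partition>".
    static List<String> tasks(int subtopologies, int partitions) {
        List<String> ids = new ArrayList<>();
        for (int s = 0; s < subtopologies; s++) {
            for (int p = 0; p < partitions; p++) {
                ids.add(s + "_" + p);
            }
        }
        return ids;
    }

    // Assumed round-robin spread of tasks over stream threads;
    // the real assignment logic is more involved.
    static List<List<String>> assign(List<String> taskIds, int threads) {
        List<List<String>> perThread = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            perThread.add(new ArrayList<>());
        }
        for (int i = 0; i < taskIds.size(); i++) {
            perThread.get(i % threads).add(taskIds.get(i));
        }
        return perThread;
    }

    public static void main(String[] args) {
        // Two subtopologies, one partition each: two tasks.
        List<String> ids = tasks(2, 1);
        System.out.println(ids);             // [0_0, 1_0]
        System.out.println(assign(ids, 2));  // [[0_0], [1_0]]
        System.out.println(assign(ids, 3));  // [[0_0], [1_0], []]
    }
}
```

The last line shows why a third stream thread would not help here:
with only two tasks, it would have nothing to process.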

2. Yes, your example would result in three subtopologies, and they
should run on three Kafka Streams instances with one stream thread
each.

One drawback of having multiple pipelines in the same Kafka Streams
application is that you cannot configure and scale them independently.
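
If independent scaling matters, one option would be to run each
pipeline as its own Kafka Streams application with its own settings.
The sketch below only builds the configuration objects. The keys
application.id, bootstrap.servers and num.stream.threads are real
Kafka Streams settings, but the class name, the application IDs, the
broker address and the thread counts are placeholders I chose for
illustration.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys come from Kafka Streams.
class SplitConfigSketch {

    // Builds the settings for one independently deployed pipeline.
    static Properties configFor(String applicationId, int streamThreads) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("num.stream.threads", Integer.toString(streamThreads));
        return props;
    }

    public static void main(String[] args) {
        // Each pipeline gets its own application ID and thread count.
        Properties a = configFor("pipeline-a", 1);
        Properties b = configFor("pipeline-b", 4);
        System.out.println(a.getProperty("application.id")
                + " threads=" + a.getProperty("num.stream.threads"));
        System.out.println(b.getProperty("application.id")
                + " threads=" + b.getProperty("num.stream.threads"));
    }
}
```

Each Properties object would then be passed to its own KafkaStreams
instance, so one pipeline can be given more threads or instances
without touching the other.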

Best,
Bruno

On Fri, May 15, 2020 at 1:46 PM Rapeepat (Lookmoo) Sriwichai
<rapee...@central.tech> wrote:
>
> Dear Kafka,
>
> Hi there. I have a question about Kafka Stream parallelism.
> I know that Kafka Stream parallelism is based on consumer group.
> Like, if you have 3 partitions source topic you can have maximum 3 consumer 
> instances (or 3 kafka stream instances) at max that will work concurrently.
> I have 2 scenarios about it and would like to know how it will work?
>
>   1.  Assume I have a stream pipeline that read from topicA process save to 
> topicB read it back (using through method) process with another logic and 
> save to topicC
> topicA => transform1 => topicB => transform2 => topicC
> Supposed that all topics have only 1 partition(for simplicity), Can I have 2 
> instances of Kafka Stream and expected each one of them to do the transform1 
> and transform2 balancely ?
>   2.  Is it a good idea to have multiple streaming pipelines in same Kafka 
> Stream application ?
> Like
> topicA1 =>  transformA => topicA2
> topicB1 =>  transformB => topicB2
> topicC1 =>  transformC => topicC2
> and like previous one. assume each topic have only 1 partition, can I have 3 
> Kafka Stream instances and expect each one will take 1 task of each stream 
> pipeline ?
>
>
> Regards,
> Rapeepat
>
