On 2019/06/14 18:25:37, "Matthias J. Sax" <matth...@confluent.io> wrote:
> >> We are using Kafka Streams 2.1.1 and the DSL APIs for stateful
> >> processing (aggregation). Our use case led us to create ~400
> >> sub-topologies.
> >> We observe that stream tasks are created for each of these
> >> sub-topologies according to the number of partitions in the
> >> input/source topic. Is having so many sub-topologies a bad idea, or is
> >> there a practical limit on the number of sub-topologies?
>
>
> The only concern might be the consumer group metadata.
>
> Kafka Streams encodes considerably more information than plain
> consumers, and there were issues with large numbers of partitions in
> the past. Work to address this problem is currently in progress.
>
> https://issues.apache.org/jira/browse/KAFKA-7149
>
>
Thanks Matthias! Will keep that in mind.
> >> Our other observation is that, with this many sub-topologies, the Kafka
> >> cluster takes longer to create the corresponding internal topics and
> >> respond to the Kafka Streams application, resulting in an endless
> >> "(Re-)joining group" loop.
> >>
> >> We have overcome this by increasing the request and session timeout
> >> values, but wanted to be sure this has no other side effects.
>
> That is expected and increasing the timeouts accordingly is the right
> approach.
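
A minimal sketch of what such a timeout override could look like. The
thread does not say which values were used, so the numbers, the
application id, and the bootstrap address below are all assumptions.
The keys are the standard client config names; the "consumer." prefix
scopes a setting to the consumers that Kafka Streams creates. Plain
string keys are used so the snippet compiles without the Kafka jars.

```java
import java.util.Properties;

public class StreamsTimeoutSketch {

    // Builds Properties that could be handed to `new KafkaStreams(topology, props)`.
    // All concrete values are illustrative assumptions, not recommendations.
    static Properties build(int sessionTimeoutMs, int requestTimeoutMs) {
        Properties props = new Properties();
        props.put("application.id", "many-subtopologies-app"); // hypothetical name
        props.put("bootstrap.servers", "localhost:9092");      // hypothetical address

        // Give group members more time before the coordinator evicts them.
        props.put("consumer.session.timeout.ms", String.valueOf(sessionTimeoutMs));
        // Keep the heartbeat interval at no more than one third of the session timeout.
        props.put("consumer.heartbeat.interval.ms", String.valueOf(sessionTimeoutMs / 3));
        // Allow slow broker responses (e.g. during internal topic creation).
        props.put("request.timeout.ms", String.valueOf(requestTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(60_000, 120_000);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Two things to keep in mind with a setup like this: the session timeout
has to fall within the broker's group.min.session.timeout.ms and
group.max.session.timeout.ms range, and a longer session timeout also
means a crashed instance takes longer to be detected and rebalanced away.
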
>
>
>
> -Matthias
>
> On 6/13/19 9:56 AM, emailtokir...@gmail.com wrote:
> > Hi,
> >
> > We are using Kafka Streams 2.1.1 and the DSL APIs for stateful
> > processing (aggregation). Our use case led us to create ~400
> > sub-topologies.
> > We observe that stream tasks are created for each of these
> > sub-topologies according to the number of partitions in the
> > input/source topic. Is having so many sub-topologies a bad idea, or is
> > there a practical limit on the number of sub-topologies?
> >
> > Our other observation is that, with this many sub-topologies, the Kafka
> > cluster takes longer to create the corresponding internal topics and
> > respond to the Kafka Streams application, resulting in an endless
> > "(Re-)joining group" loop.
> >
> > We have overcome this by increasing the request and session timeout
> > values, but wanted to be sure this has no other side effects.
> >
>
>