Re: Kafka Scaling Ideas

2020-12-22 Thread Haruki Okada
Hm, it's an optimization for the "first layer", so if the bottleneck is in the "second layer" (i.e. DB write) as you mentioned, it shouldn't make much difference I think. On Tue, Dec 22, 2020 at 16:02, Yana K wrote: > I thought about it but then we don't have much time - will it optimize > performance? > > On Mon, Dec

Re: Kafka Scaling Ideas

2020-12-21 Thread Yana K
I thought about it but then we don't have much time - will it optimize performance? On Mon, Dec 21, 2020 at 4:16 PM Haruki Okada wrote: > About "first layer" right? > Then it's better to make sure you don't call get() on the result of Producer#send() > for each message, because that way it spoils the

Re: Kafka Scaling Ideas

2020-12-21 Thread Haruki Okada
About "first layer" right? Then it's better to make sure you don't call get() on the result of Producer#send() for each message, because that way it spoils the producer's ability to batch. The Kafka producer batches messages by default and it's very efficient, so if you produce asynchronously, it rarely beco

Re: Kafka Scaling Ideas

2020-12-21 Thread Yana K
Thanks! Also are there any producer optimizations anyone can think of in this scenario? On Mon, Dec 21, 2020 at 8:58 AM Joris Peeters wrote: > I'd probably just do it by experiment for your concrete data. > > Maybe generate a few million synthetic data rows, and for-each-batch insert > them i

Re: Kafka Scaling Ideas

2020-12-21 Thread Joris Peeters
I'd probably just do it by experiment for your concrete data. Maybe generate a few million synthetic data rows, and for-each-batch insert them into a dev DB, with an outer grid search over various candidate batch sizes. You're looking to optimise for flat-out rows/s, so whichever batch size wins (
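The experiment described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for the dev DB, and the table name `events`, the helper names `rows_per_second` and `grid_search`, and the candidate batch sizes are all made up for the example. Absolute numbers from SQLite will not transfer to a real database; only the shape of the experiment does.

```python
import sqlite3
import time


def rows_per_second(rows, batch_size):
    """Insert all rows into a fresh in-memory table, one commit per batch,
    and return the achieved throughput in rows per second."""
    conn = sqlite3.connect(":memory:")  # stand-in for the dev DB (assumption)
    conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
    start = time.perf_counter()
    for i in range(0, len(rows), batch_size):
        with conn:  # commits the batch as a single transaction
            conn.executemany(
                "INSERT INTO events VALUES (?, ?)", rows[i:i + batch_size]
            )
    elapsed = time.perf_counter() - start
    conn.close()
    return len(rows) / elapsed


def grid_search(rows, candidate_batch_sizes):
    """Measure every candidate batch size; return the winner and all results."""
    results = {size: rows_per_second(rows, size) for size in candidate_batch_sizes}
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    # Synthetic rows; a real run would use a few million realistic ones.
    synthetic = [(i, f"payload-{i}") for i in range(100_000)]
    best, results = grid_search(synthetic, [1, 10, 100, 1_000, 10_000])
    for size, rate in sorted(results.items()):
        print(f"batch size {size:>6}: {rate:,.0f} rows/s")
    print(f"best batch size: {best}")
```

Against a real database the same loop applies, with the connection and insert statement swapped for that database's driver.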

Re: Kafka Scaling Ideas

2020-12-21 Thread Yana K
Thanks Haruki and Joris. Haruki: Thanks for the detailed calculations. Really appreciate it. What tool/lib is used to load test Kafka? So we have one consumer group and are running 7 instances of the application - that should be good enough - correct? Joris: Great point. DB insert is a bottleneck (and

Re: Kafka Scaling Ideas

2020-12-21 Thread Joris Peeters
Do you know why your consumers are so slow? 12E6 msg/hour is about 3,333 msg/s, which is not very high from a Kafka point-of-view. As you're doing database inserts, I suspect that is where the bottleneck lies. If, for example, you're doing a single-row insert in a SQL DB for every message then this would i

Re: Kafka Scaling Ideas

2020-12-21 Thread Haruki Okada
About load test: I think it'd be better to monitor per-message processing latency and estimate the required partition count based on it, because it determines the max throughput per single partition. - Say you have to process 12 million messages/hour ≈ 3,333 messages/sec. - If you have 7 partitions (thus 7
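The estimate above can be turned into a small calculation. The sketch below assumes each partition is processed strictly sequentially by one consumer thread, so one partition handles at most 1 / latency messages per second. The function name `required_partitions` and the sample latencies are hypothetical and are not taken from the thread.

```python
import math


def required_partitions(messages_per_hour, per_message_latency_s):
    """Lower bound on the partition count when each partition is processed
    sequentially (assumption: one consumer thread per partition)."""
    target_rate = messages_per_hour / 3600.0          # messages/sec overall
    per_partition_rate = 1.0 / per_message_latency_s  # messages/sec per partition
    return math.ceil(target_rate / per_partition_rate)


if __name__ == "__main__":
    # 12 million messages/hour is about 3,333 messages/sec.
    for latency_ms in (1, 5, 20):  # illustrative latencies, not measurements
        n = required_partitions(12_000_000, latency_ms / 1000.0)
        print(f"{latency_ms} ms per message -> at least {n} partitions")
```

The result is a floor, not a target: it leaves no headroom for latency spikes or rebalances, so the measured latency should come from the real processing path under load.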

Re: Kafka Scaling Ideas

2020-12-20 Thread Yana K
So as the next step I see to increase the partition count of the 2nd topic - do I increase the instances of the consumer from that or keep it at 7? Anything else (besides researching those libs)? Are there any good tools for load testing Kafka? On Sun, Dec 20, 2020 at 7:23 PM Haruki Okada wrote: > It

Re: Kafka Scaling Ideas

2020-12-20 Thread Haruki Okada
It depends on how you manually commit offsets. Auto-commit basically commits offsets asynchronously, so as long as you do manual commits the same way, there should not be much difference. And generally, offset-commit mode doesn't make much difference in performance regardless of manual/auto o

Re: Kafka Scaling Ideas

2020-12-20 Thread Yana K
Thank you so much Marina and Haruki. Marina's response: - When you say " if you are sure there is no room for perf optimization of the processing itself :" - do you mean code level optimizations? Can you please explain? - On the second topic you say " I'd say at least 40" - is this based on 12 mil

Re: Kafka Scaling Ideas

2020-12-19 Thread Haruki Okada
Hi. Yeah, Spring-Kafka processes messages sequentially, so the consumer throughput would be capped by database latency per single process. One possible solution is creating an intermediate topic (or altering the source topic) with many more partitions, as Marina suggested. I'd like to suggest an

Re: Kafka Scaling Ideas

2020-12-19 Thread Marina Popova
The way I see it - you can only do a few things - if you are sure there is no room for perf optimization of the processing itself : 1. speed up your processing per consumer thread: which you already tried by splitting your logic into a 2-step pipeline instead of 1-step, and delegating the work o

Kafka Scaling Ideas

2020-12-19 Thread Yana K
Hi, I am new to the Kafka world and am running into this scale problem. I thought of reaching out to the community to see if someone can help. So the problem is I am trying to consume from a Kafka topic that can have a peak of 12 million messages/hour. That topic is not under my control - it has 7 partitions

Re: kafka scaling

2019-04-03 Thread Evelyn Bayes
Hi Ramz, A good rule of thumb has been no more than 4,000 partitions per broker and no more than 100,000 in a cluster. This includes all replicas, and it's related more to Kafka internals than it is to resource usage, so I strongly advise not pushing these limits. Otherwise, the usual reasons for sc
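A quick way to check a cluster against this rule of thumb is sketched below. It assumes replicas are spread evenly across brokers, which real clusters only approximate. The function name `check_partition_budget` and the sample topic layout are hypothetical; the two limits are the ones quoted above.

```python
MAX_PARTITIONS_PER_BROKER = 4_000     # rule of thumb quoted above
MAX_PARTITIONS_PER_CLUSTER = 100_000  # rule of thumb quoted above


def check_partition_budget(topics, broker_count):
    """topics: iterable of (partition_count, replication_factor) pairs.
    Counts every replica, since the rule of thumb includes all replicas."""
    total_replicas = sum(partitions * rf for partitions, rf in topics)
    per_broker = total_replicas / broker_count  # assumes an even spread
    return {
        "total_replicas": total_replicas,
        "per_broker": per_broker,
        "within_broker_limit": per_broker <= MAX_PARTITIONS_PER_BROKER,
        "within_cluster_limit": total_replicas <= MAX_PARTITIONS_PER_CLUSTER,
    }


if __name__ == "__main__":
    # Hypothetical 3-node cluster with two topics, replication factor 3.
    print(check_partition_budget([(100, 3), (40, 3)], broker_count=3))
```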

kafka scaling

2019-04-03 Thread Rammohan Vanteru
Hi users, On what basis should we scale a Kafka cluster, and what symptoms would indicate that scaling is needed? I have a 3-node Kafka cluster; how many partitions at most can a single broker or a Kafka cluster support? Any article or knowledge share on scaling Kafka would be a help. Thanks, Ramz.

RE: Rebalancing issue while Kafka scaling

2016-06-01 Thread Thakrar, Jayesh
--- From: Hafsa Asif [mailto:hafsa.a...@matchinguu.com] Sent: Wednesday, June 01, 2016 7:05 AM To: users@kafka.apache.org Cc: Spico Florin Subject: Re: Rebalancing issue while Kafka scaling Just for more info: If I have 10 servers in a cluster, so for the most tolerant cluster, do we need replication-

Re: Rebalancing issue while Kafka scaling

2016-06-01 Thread Hafsa Asif
eplicas, including the server > >> being > >>>> removed and you intend to rebalance after server removal). > >>>> > >>>> However, "automating" the rebalancing of topic partitions is not > >> trivial. > >>>> > >>

Re: Rebalancing issue while Kafka scaling

2016-06-01 Thread Ben Stopford
ions is not >> trivial. >>>> >>>> There is a KIP out there to help with the rebalancing, but lacks >> details >>>> - >>>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+rebalancing

Re: Rebalancing issue while Kafka scaling

2016-06-01 Thread Hafsa Asif
pache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+rebalancing > >> My guess is due to its non-trivial nature AND the number of cases one > >> needs to take care of - e.g. scaling up by 5% v/s scaling up by 50% in > say, > >> a 20 node clus

Re: Rebalancing issue while Kafka scaling

2016-06-01 Thread Ben Stopford
the number of cases one >> needs to take care of - e.g. scaling up by 5% v/s scaling up by 50% in say, >> a 20 node cluster. >> Furthermore, to be really effective, one needs to be cognizant of the >> partition sizes, and with rack-awareness, the task becomes even more >

Re: Rebalancing issue while Kafka scaling

2016-06-01 Thread Hafsa Asif
> Furthermore, to be really effective, one needs to be cognizant of the > partition sizes, and with rack-awareness, the task becomes even more > involved. > > Regards, > Jayesh > > -Original Message- > From: Spico Florin [mailto:spicoflo...@gmail.com] > Sent

RE: Rebalancing issue while Kafka scaling

2016-05-31 Thread Thakrar, Jayesh
- From: Spico Florin [mailto:spicoflo...@gmail.com] Sent: Tuesday, May 31, 2016 9:44 AM To: users@kafka.apache.org Subject: Re: Rebalancing issue while Kafka scaling Hi! What version of Kafka are you using? What do you mean by "Kafka needs rebalancing?" Rebalancing of what? Can you ple

Re: Rebalancing issue while Kafka scaling

2016-05-31 Thread Spico Florin
Hi! What version of Kafka are you using? What do you mean by "Kafka needs rebalancing?" Rebalancing of what? Can you please be more specific? Regards, Florin On Tue, May 31, 2016 at 4:58 PM, Hafsa Asif wrote: > Hello Folks, > > Today, my team members raised a concern that whenever we add

Rebalancing issue while Kafka scaling

2016-05-31 Thread Hafsa Asif
Hello Folks, Today, my team members raised a concern that whenever we add a node to the Kafka cluster, Kafka needs rebalancing. The rebalancing is a somewhat manual and undesirable step whenever scaling happens. Second, if Kafka scales up then it cannot be scaled down. Please provide us proper guidance over