Hi Marisa,
I am going to be running it in a Kubernetes cluster on Azure Kubernetes
Service using the setup scripts available in my GitHub repo
https://github.com/izzyacademy/kafka-in-a-box
https://youtu.be/TDw3tDAiBBM
I will review the recommendations from the EventSizer.io tool as well to make
Hi Israel,
Great job! It looks promising, and I really like your YouTube channel and
the way you present the material. A couple of things that you might want to
consider for your benchmark experiments:
1) What machine are you going to use? Is it a fast machine with enough CPU
cores? I woul
Marisa,
I have kicked off the video series on performance optimization for the
Kafka setup.
I will be working on the various configurations for latency, throughput,
availability and durability.
https://youtu.be/aPlbG349cXg
The first ones will be on latency and throughput which is what you are
i
Hi Alex,
> Furthermore, setting up a localhost pub/sub demo on a single machine
(your laptop?) is so far removed from a real-world scenario I can't imagine
how any numbers derived from that would be useful.
I can't imagine either. That's why I'm planning to run this on a lab Linux
machine with 8
Wow, that's awesome! I wasn't expecting that. I truly appreciate your help
and professionalism.
> Let me find some time soon and I will do a video on that scenario
optimized primarily for low latency and throughput. I will also compare how
this performs when adjusted for durability and high availa
Marisa, you might consider engaging someone at Confluent, maybe they can
give you some case studies or whitepapers from similar use-cases in the
financial industry. (and yes, Kafka is used in the financial industry) . A
client asking you to "prove that Kafka performs/scales" seems like an
unusual
Thanks for your response Marisa.
This has been a very interesting discussion and I appreciate it.
It is a bit of a challenge in the sense that I wish I had a demo ready to
go with a similar use case and expectations, to easily explain what I have
been trying to convey
I am always ready for a chall
Hi Israel,
> You can achieve any performance benchmark you are willing to pay for.
Thanks for your email. Allow me to respectfully disagree. I believe that
some systems are better than others when it comes to performance. The idea
that I can just take a slow system, multiply by 1 million, and the
Marisa,
I do not agree with your assessment. There are several factors that could
influence your performance numbers even with localhost. Your project should
be configured based on your own needs.
Your throughput could go up or down depending on how you are configured
based on what is important
Hi Joris,
Thank you so much, friend!
> I appreciate that setting up everything on localhost will be easier and
lead to big numbers, but bear in mind that it's typically all the other
real-life stuff (remote connections, replication, at-least once, ...) that
causes massive slowdowns compared to lo
These tutorials - though quite a bit outdated - seem quite useful:
http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html (and
the follow-ups).
It ends up being close to how I write this in Java, and tutorial 13 talks
about batching and acks etc., which you'll need in order to tune to max
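The batching knobs that tutorial covers map onto a handful of standard producer configs. A minimal sketch of a throughput-leaning profile (the config names come from the Apache Kafka producer configuration reference; the values are illustrative assumptions, not recommendations — benchmark with your own workload):

```python
# Throughput-oriented producer settings (illustrative values; names are
# standard Apache Kafka producer configs).
throughput_tuned = {
    "acks": "1",                # leader-only ack: faster, less durable
    "linger.ms": "20",          # wait up to 20 ms to fill larger batches (default 0)
    "batch.size": "65536",      # 64 KiB batches instead of the 16 KiB default
    "compression.type": "lz4",  # fewer bytes on the wire per batch
}

for key, value in throughput_tuned.items():
    print(f"{key}={value}")
```

Larger `linger.ms` and `batch.size` trade a little per-record latency for better batching, which is usually what moves the throughput number.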
Hi Joris,
Thank you so much. I plan to write a Java consumer and a Java producer for
my benchmark. Do you recommend an example that I can use as a reference to
write my basic Java producer and simple Java consumer? I'll for sure share
the throughput number I get with the community. Maybe even write
I'd just follow the instructions in https://kafka.apache.org/quickstart to
set up Kafka and Zookeeper on a single node, by running the Java processes
directly. Or you can run them in Docker.
For the producer and consumer I'd personally use Python, as it's the
easiest to get going. You may want to look at
h
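Whichever client library ends up driving the benchmark, the measurement side is library-agnostic. A minimal sketch (pure Python, no Kafka dependency; the record counts and latencies below are synthetic stand-ins for a real run) of summarizing throughput and tail latency:

```python
import statistics

def summarize(latencies_s, total_records, elapsed_s):
    """Summarize one benchmark run from per-record latencies (seconds),
    the total record count, and the wall-clock duration of the run."""
    lat_sorted = sorted(latencies_s)
    # Nearest-rank style p99 over the sorted sample.
    p99 = lat_sorted[int(0.99 * (len(lat_sorted) - 1))]
    return {
        "records_per_sec": total_records / elapsed_s,
        "mean_latency_ms": statistics.mean(latencies_s) * 1000,
        "p99_latency_ms": p99 * 1000,
    }

# Synthetic numbers standing in for a real producer/consumer run:
stats = summarize([0.002] * 99 + [0.050], total_records=100_000, elapsed_s=2.0)
print(stats)
```

Reporting a percentile alongside the records/sec figure matters, because the mean hides exactly the tail behavior that batching and acks settings change.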
Hi Joris,
I've spoken to him. His answers are below:
On Thu, Jan 6, 2022 at 1:37 PM Joris Peeters
wrote:
> There are a few unknown parameters here that might influence the answer,
> though. Off the top of my head, at least
> - How much replication of the data is needed (for high availability),
Hi Okada,
Thanks for your reply. Finally I see some numbers! I love numbers :)
I've shown your email to my boss (I hope he will hire me to do this
project) and he said the following:
"I would like to see this 833k/sec number for myself. Am I asking too much?
:) Can you set up a very basic and si
There are a few unknown parameters here that might influence the answer,
though. Off the top of my head, at least
- How much replication of the data is needed (for high availability), and
how many acks for the producer? (If fire-and-forget, it can be faster; if
it needs to replicate and ack from 3 broker
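The fire-and-forget versus fully-replicated extremes map directly onto producer settings. A short sketch of the two ends of the spectrum (config names are standard Apache Kafka producer configs; values are illustrative):

```python
# Fastest, but records in flight can be lost on broker failure.
fire_and_forget = {"acks": "0"}

# Slowest, but durable: wait for the full in-sync replica set.
fully_acked = {
    "acks": "all",
    "enable.idempotence": "true",  # dedupe retries, preserve per-partition order
}

# Note: durability also needs the topic/broker side configured, e.g.
# replication.factor=3 with min.insync.replicas=2 is a common pairing
# with acks=all.
print(fire_and_forget["acks"], fully_acked["acks"])
```

Any single throughput number only means something once you say which of these two worlds (or which point in between) it was measured in.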
Hi Israel,
Your email is great, but I'm afraid to forward it to my customer because it
doesn't answer his question.
I'm hoping that other members of this list will be able to give me a more
NUMERIC answer; let's wait and see.
Just to give you some follow up on your answer, when you say:
> 30 p
Hi, Marisa.
Kafka is well-designed to make full use of system resources, so I think
calculating based on the machine's spec is a good start.
Let's say we have servers with 10Gbps full-duplex NIC.
Also, let's say we set the topic's replication factor to 3 (so the cluster
will have a minimum of 3 servers),
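Okada's arithmetic is cut off in the archive, but the shape of a NIC-bound estimate can be sketched as back-of-envelope math. The 1.5 KB average record size and the saturation model below are my assumptions, not Okada's:

```python
# Back-of-envelope ingest estimate from NIC bandwidth (a sketch; the
# record size and the saturation model are assumptions).
nic_gbps = 10                             # full-duplex NIC per broker
nic_bytes_per_sec = nic_gbps * 1e9 / 8    # 1.25 GB/s inbound

record_bytes = 1500                       # assumed average record size

# With replication factor 3 on a 3-broker cluster, each broker's inbound
# link carries its 1/3 share of producer traffic as leader plus follower
# fetches for the other 2/3 of partitions, so cluster ingest is roughly
# bounded by a single NIC's inbound bandwidth.
records_per_sec = nic_bytes_per_sec / record_bytes
print(int(records_per_sec))  # 833333
```

That lands at about 833k records/sec, which is the same order as the 833k/sec figure quoted elsewhere in this thread; halving the record size roughly doubles the estimate, which is why the assumed size must always be stated with the number.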
Hi Marisa,
I think there may be some confusion about the throughput for each
partition, and I want to explain briefly using some analogies.
Using transportation as an example: if we were to pick an airline or
ridesharing organization to describe the volume of customers they can
support per day, we would
Cheers from NYC!
I'm trying to give a performance number to a potential client (from the
financial market) who asked me the following question:
*"If I have a Kafka system setup in the best way possible for performance,
what is an approximate number that I can have in mind for the throughput of
th