Hi Sigalit,

First of all, did you make sure that each source operator uses a different
consumer group id for the Kafka source? Does each Flink job share the same
data, or does each job consume the data independently?
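
For example, if you are on the new KafkaSource API, the group id is set on the 
builder. This is just a rough sketch assuming Flink 1.14+; the brokers, topic 
and group id below are placeholders, not your actual values:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class GroupIdSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("broker:9092")         // placeholder
                    .setTopics("input-topic")                   // placeholder
                    .setGroupId("job-a-consumer-group")         // assumed unique per job
                    .setStartingOffsets(OffsetsInitializer.latest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
               .print();
            env.execute("group-id-sketch");
        }
    }

If each of the 5 jobs has its own group id, you can also compare the consumer 
lag per group on the Kafka side.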

Moreover, is your job back-pressured? You might need to break the operator 
chain to see whether the sink back-pressures the source and limits the 
throughput of the source.
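
A minimal sketch of breaking the chain (not your actual job, just to show the 
two options):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ChainBreakSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Option 1: disable chaining for the whole job, so source, map and sink
            // become separate tasks and the web UI shows back pressure per task.
            env.disableOperatorChaining();

            env.fromElements("a", "b", "c")     // stand-in for the Kafka source
               .map(String::toUpperCase)        // stand-in for the object conversion
               .disableChaining()               // Option 2: break the chain at this operator only
               .print();                        // stand-in for the real sink

            env.execute("chain-break-sketch");
        }
    }

Once the operators run as separate tasks, the back pressure view in the web UI 
will tell you which stage is the bottleneck.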

Last but not least, is your source already at 100% CPU usage? That would mean 
the source operator has already reached its maximum throughput.

Best
Yun Tang
________________________________
From: Sigalit Eliazov <e.siga...@gmail.com>
Sent: Thursday, April 7, 2022 19:12
To: user <user@flink.apache.org>
Subject: flink pipeline handles very small amount of messages in a second (only 
500)

Hi all,

I would appreciate some help understanding the pipeline behaviour...

We deployed a standalone Flink cluster. The pipelines are deployed via the 
JobManager REST API.
We have 5 task managers with 1 slot each.

In total I am deploying 5 pipelines. Each mainly reads from Kafka, performs a 
simple object conversion, and either writes back to Kafka, publishes to GCP 
Pub/Sub, or saves to the DB.
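
(Roughly, the Kafka-to-Kafka variant of a pipeline is shaped like the sketch 
below; the topics, the group id and the conversion are placeholders, and the 
real jobs deserialize Avro records rather than using the string schema shown 
here.)

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PipelineSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("broker:9092")   // placeholder
                    .setTopics("in-topic")                // placeholder
                    .setGroupId("pipeline-a")             // placeholder
                    .setStartingOffsets(OffsetsInitializer.latest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            KafkaSink<String> sink = KafkaSink.<String>builder()
                    .setBootstrapServers("broker:9092")   // placeholder
                    .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                            .setTopic("out-topic")        // placeholder
                            .setValueSerializationSchema(new SimpleStringSchema())
                            .build())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
               .map(String::trim)                         // stand-in for the simple object conversion
               .sinkTo(sink);

            env.execute("pipeline-sketch");
        }
    }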

These jobs run "forever", and basically each task manager runs a specific job 
(this is how Flink handled it).

We have a test that sends 10K messages per second to Kafka, but according to 
the metrics exposed by Flink, the relevant job handles only 500 messages per 
second.
I would expect all 10K to be handled, so I guess the setup is not correct.

The messages are in Avro format.
Currently we are not using checkpoints at all.
Any suggestions are welcome.

Thanks a lot,
Sigalit


