Do we have any mechanism to control requests per second for a Kafka connect sink?

Yeikel Santana Mon, 04 Dec 2023 07:53:04 -0800

Hello everyone,

Is there any mechanism to make a Kafka Connect sink ingest at a given rate 
(requests per second), rather than controlling throughput only through the 
number of tasks?

I am operating in a shared environment where the ingestion rate needs to be 
kept as low as possible (for example, an upper limit of 5 requests/second), and 
as far as I can tell, `tasks` are the main unit of work we can tune.

My current understanding is that a task blocks while it processes one batch 
and moves on to the next batch as soon as the previous request completes. This 
means that if the target server can handle requests at a higher rate, the sink 
will keep sending at that rate.

However, in my scenario, I need to send at most n requests per second and then 
sit idle for the rest of that second to avoid overloading the target server.

In this specific example, my best attempt to control the throughput was a 
configuration along these lines:

```json
"connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
"tasks.max": "1",
"max.retries": "10",
"retry.backoff.ms": "1000",
"max.buffered.records": "100",
"batch.size": "100",
"max.in.flight.requests": "1",
"flush.synchronously": "true",
```

Unfortunately, while that helps, it does not solve the underlying problem. I 
also understand that this is very specific to the given sink connector, but my 
question is more about a global override that could be applied, if one exists.

As an alternative, I suppose I could add a `Thread.sleep` call in an SMT, or 
fork ElasticsearchSinkConnector to introduce something similar, but neither 
sounds like a good solution.
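
To make the `Thread.sleep` idea concrete, here is a rough sketch of the kind of 
pacing I have in mind. `RequestThrottle` is a hypothetical helper of my own, not 
something Kafka Connect or the Elasticsearch connector provides. I am assuming 
it would be called once before each request in a forked connector; an SMT runs 
per record rather than per request, so there it would only approximate a 
request limit.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Kafka Connect): spaces calls so that
 * no more than maxPerSecond of them start in any given second.
 */
final class RequestThrottle {
    private final long minIntervalNanos; // minimum gap between two requests
    private long nextAllowedNanos;       // earliest start time of the next request

    RequestThrottle(double maxPerSecond) {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("maxPerSecond must be positive");
        }
        this.minIntervalNanos = (long) (TimeUnit.SECONDS.toNanos(1) / maxPerSecond);
        this.nextAllowedNanos = System.nanoTime();
    }

    /** Reserves the next slot and returns how long the caller has to wait, in nanoseconds. */
    synchronized long reserve(long nowNanos) {
        long waitNanos = Math.max(0L, nextAllowedNanos - nowNanos);
        // Schedule the following slot one interval after this request's start time.
        long startNanos = waitNanos > 0 ? nextAllowedNanos : nowNanos;
        nextAllowedNanos = startNanos + minIntervalNanos;
        return waitNanos;
    }

    /** Blocks the calling thread until the next request is allowed to start. */
    void acquire() throws InterruptedException {
        long waitNanos = reserve(System.nanoTime());
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

With `new RequestThrottle(5.0)`, calling `acquire()` before each request would 
space requests at least 200 ms apart, no matter how quickly the target server 
responds.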

Thank you!
