Hi Henry,

Thanks for getting the ball rolling on this. Funnily enough, I've just
started running into the same problem in the wild as I've been working on a
prototype Connect deployment for my company.

One thing that sticks out to me in the KIP is the call-out of 500 tasks.
That seems like a fairly low number to be hitting the 30-second timeout,
no? Less than 17 task configs per second.

Could we maybe think first about how to improve the performance of the
worker when producing to the config topic, instead of allowing for
increasingly higher timeouts for publishing task configs? I've had some
success with tweaking linger time, batch size, etc. An alternative KIP
based on this approach could 1) introduce granular configuration for the
worker's internal producers (e.g., allow users to configure linger time for
the config topic but not for the status topic) and 2) introduce new
high-throughput defaults for the config topic (although we'd have to be
pretty conservative about setting something like linger time since it'd set
a lower bound on the time that most control plane operations would take).
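
To make the idea a bit more concrete, here is a rough sketch of the kind of
producer overrides I have in mind. linger.ms and batch.size are standard
Kafka producer properties; the class name, the helper method, and the
specific values are made up purely for illustration and are not
recommendations. How a worker would scope these to the config topic only is
exactly the open question, so nothing here reflects an existing Connect
setting.

```java
import java.util.Properties;

public class ConfigTopicProducerSketch {

    // Hypothetical helper: collects throughput-oriented producer overrides.
    // The property names are real producer configs; the helper itself and
    // the idea of applying it only to the config topic are assumptions.
    static Properties throughputOrientedOverrides(long lingerMs, int batchSizeBytes) {
        Properties props = new Properties();
        // Wait up to lingerMs for more records before sending a batch.
        // This trades a little latency on every write for larger batches.
        props.setProperty("linger.ms", Long.toString(lingerMs));
        // Upper bound, in bytes, on the size of a single per-partition batch.
        props.setProperty("batch.size", Integer.toString(batchSizeBytes));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only: 10 ms of linger, 64 KiB batches.
        Properties overrides = throughputOrientedOverrides(10, 64 * 1024);
        System.out.println("linger.ms = " + overrides.getProperty("linger.ms"));
        System.out.println("batch.size = " + overrides.getProperty("batch.size"));
    }
}
```

Whatever linger value ends up in a default would need to stay small, for
the lower-bound reason mentioned above.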

I'm worried about the availability of Connect's control plane if the
leader's herder tick thread is occupied for too long writing task configs,
and would like to address the root of this problem if possible.

Cheers,

Chris

On Thu, May 14, 2026, 03:13 Henry Haiying Cai via dev <[email protected]>
wrote:

> I would like to start discussing KIP-1339: Make KafkaConfigBackingStore
> read_write_total_timeout configurable
>
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1339%3A+Make+KafkaConfigBackingStore+read_write_total_timeout+configurable
>
> In Kafka Connect, the KafkaConfigBackingStore class is responsible for
> persisting connector and task configurations in a Kafka topic. Currently,
> the timeout for synchronous write operations and then reading back from the
> configuration topic is hardcoded via the READ_WRITE_TOTAL_TIMEOUT_MS
> constant, set to 30,000 milliseconds (30 seconds).
>
> While 30 seconds is sufficient for many environments, this hardcoded limit
> can be problematic in some specific scenarios.
>
> There were earlier discussions (
> https://lists.apache.org/thread/flpm1qrm7cwj6fqr1j8c0p8sbqysgv34) on
> extending the timeout value. This KIP proposes making this timeout value a
> configurable parameter.
>
> Any feedback is appreciated.
>
> Thanks and Regards
>
> Henry Cai
>
