Hi Skander,

Thanks for the KIP. Here are some of my thoughts on it ...

I think using a poller instead of the WatchService is a good choice. In the
previous KIP (KIP-1119), this was my main concern about why it would not
work.

However, are you sure that Files.getLastModifiedTime() will work on
Kubernetes with something like a mounted ConfigMap or Secret? The file
itself is a symlink, and its dates do not change when a Secret is updated,
at least when observed with something like the stat command in a shell.
Only the dates of the file that the symlink points to change. So, off the
top of my head, I'm not sure which timestamp Java would give you (I haven't
tried it, to be honest - I'm just wondering if you did and if it really
works). If the
timestamp doesn't work, maybe one can just read the content of the file and
store some checksum to compare it with in the next check?
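
A minimal sketch of the checksum idea, for illustration only. The class name
ContentChangeDetector and its method are hypothetical and are not taken from
the KIP or the PR. Files.readAllBytes follows symlinks, so the digest is
computed over whatever the symlink currently resolves to. (Per the Javadoc,
Files.getLastModifiedTime also follows symlinks unless
LinkOption.NOFOLLOW_LINKS is passed, but that has not been verified here
against a mounted Secret.)

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper: detects changes by content digest instead of mtime.
final class ContentChangeDetector {
    private final Path path;
    private byte[] lastDigest; // null until the first check

    ContentChangeDetector(Path path) {
        this.path = path;
    }

    /** Returns true if the content differs from the previous check (the first check returns true). */
    boolean hasChanged() throws IOException, NoSuchAlgorithmException {
        // readAllBytes resolves symlinks, so this hashes the target file's content.
        byte[] content = Files.readAllBytes(path);
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
        boolean changed = !Arrays.equals(digest, lastDigest);
        lastDigest = digest;
        return changed;
    }
}
```

The trade-off is that every poll reads the whole file, which should be cheap
for keystores and PEM files but is more I/O than a timestamp lookup.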

The other part of my comments in KIP-1119 was more about the usability for
something like Strimzi. I do not think the debounce interval really solves
the issue for us. With Kafka, you have a distributed system with:
* Multiple controllers
* Multiple brokers
* Additional components (e.g., an Operator, Cruise Control, etc.)

So when I need to, for example, roll out a new Certificate Authority, and I
use mTLS authentication, I have to:
* First, roll out the trust to the new CA to all the components
* Only once all components trust the new CA, I can start rolling out the
new server/user certificates
* Once the new user and server certificates are used by all components, I
can remove the old CA

But the debounce interval works only locally within a single Kafka node. So
while it allows me to safely reload the certificates within the node, which
is good, it does not help me understand the state of the other nodes. To
be able to orchestrate the whole system, I need a way to find out whether
each node has reloaded in order to proceed with the next steps. For
example, I could open a TCP connection and sniff the actual TLS
configuration. But that is pretty ugly, and leaves a mess in the logs and
so on.
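
To make the "sniffing" workaround concrete, here is a rough sketch using the
standard JSSE API. TlsProbe and its methods are hypothetical names. The
trust-all manager is an assumption made so the probe can still see the
certificate while trust is being rotated; it must never be used for real
traffic. If the listener requires client authentication, the probe would
likely also need a client keystore, otherwise the handshake may fail before
the certificate can be read.

```java
import java.security.MessageDigest;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical probe: reports the fingerprint of the certificate a node presents.
final class TlsProbe {
    // Accepts any certificate. For inspection ONLY, never for real connections.
    private static final TrustManager[] TRUST_ALL = { new X509TrustManager() {
        public void checkClientTrusted(X509Certificate[] chain, String authType) { }
        public void checkServerTrusted(X509Certificate[] chain, String authType) { }
        public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
    }};

    /** Lower-case hex SHA-256 of the given bytes. */
    static String sha256Hex(byte[] data) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Connects, completes the handshake, and fingerprints the leaf certificate. */
    static String leafFingerprint(String host, int port) throws Exception {
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, TRUST_ALL, null);
        try (SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket(host, port)) {
            socket.startHandshake();
            Certificate[] chain = socket.getSession().getPeerCertificates();
            return sha256Hex(chain[0].getEncoded()); // chain[0] is the peer's own certificate
        }
    }
}
```

An operator would have to call leafFingerprint against every listener on
every node and compare the result with the expected certificate. Each probe
is a throwaway connection that shows up in the broker logs, and it only
reveals the presented certificate, not the truststore contents.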

Don't get me wrong. I think this is a useful KIP, and I guess that in many
cases - especially when running things manually - it would work fine. It
would also work fine for reloading server certificates only, without
mTLS, which is a useful feature as well, with CAs such as Let's Encrypt
shortening the validity period of their server certificates.

But for an automated solution like Strimzi, the main missing feature for
the hot-reloading of certificates is not about the auto-reload being done
by Kafka. It is an API that would tell us what the current state of the
system is, so that we can orchestrate more complicated things.

Thanks & Regards
Jakub

On Sat, Feb 21, 2026 at 3:58 PM Skander Soltane <[email protected]>
wrote:

> Hi all,
>
> I'd like to start a discussion on a new KIP for SSL hot reload on the
> client side.
>
> You can find the KIP here :
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1288%3A+SSL+Hot+Reload+for+Kafka+Clients
>
> I also drafted a PR implementing the KIP as I imagined it:
> https://github.com/apache/kafka/pull/21488
>
> I'd love to hear your thoughts, especially on the polling approach vs
> WatchService, the debounce mechanism, and whether the registry design makes
> sense to you.
>
> Thank you!
> Skander
>
