Thanks, Rajini. Yes, it's there for AdminClientConfig. I missed it. Jun
On Thu, Nov 7, 2024 at 11:56 AM Rajini Sivaram <rajinisiva...@gmail.com> wrote: > Thanks Jun! > > Looks like the configs are not ordered alphabetically, but > metadata.recovery.strategy is included here: > > https://kafka.apache.org/documentation/#adminclientconfigs_metadata.recovery.strategy > . > > Regards, > > Rajini > > > On Thu, Nov 7, 2024 at 6:51 PM Jun Rao <j...@confluent.io.invalid> wrote: > > > Hi, Rajini, > > > > Thanks for the explanation. The KIP looks good to me now. > > > > Also, it seems that metadata.recovery.strategy is missing for > > AdminClientConfigs in > > https://kafka.apache.org/documentation/#adminclientconfigs. Not sure why > > this is happening since it's autogenerated. > > > > Jun > > > > On Thu, Nov 7, 2024 at 12:56 AM Rajini Sivaram <rajinisiva...@gmail.com> > > wrote: > > > > > Hi Jun, > > > > > > Thanks for reviewing the KIP. > > > > > > 1) When we need metadata, leastLoadedNode() attempts connection to > node1 > > > and if that fails, node1 is in reconnection backoff for a second. So > > > leastLoadedNode() attempts connection to node2. If the connection > attempt > > > to node2 takes 5 seconds and fails, we try leastLoadedNode again. This > > > time, node1 is no longer in reconnection backoff because its backoff > > period > > > of one second has elapsed. With the metadata recovery strategy in > > KIP-899, > > > all nodes have to be unavailable to trigger rebootstrap. Since node1 is > > no > > > longer in backoff, we will attempt node1 again rather than rebootstrap. > > > > > > 2) Good point. Since we are changing the default rebootstrap strategy, > it > > > makes sense to include share consumer and KStreams as well in this KIP. > > > Looks like we have WorkerConfig in Connect too. Will add them to the > KIP. > > > > > > Regards, > > > > > > Rajini > > > > > > On Thu, Nov 7, 2024 at 1:32 AM Jun Rao <j...@confluent.io.invalid> > wrote: > > > > > > > Hi, Rajini, > > > > > > > > Thanks for the KIP. Looks good to me overall. Just a couple of minor > > > > comments. > > > > > > > > 1. "So if a connection attempt is stuck for over a second, other > nodes > > > that > > > > are in reconnection backoff state will no longer be in backoff and > as a > > > > result, rebootstrap will never be attempted." Not sure that I > > understand > > > > this. Why would a stuck connection to one node affect the > reconnection > > > > backoff state to other nodes? > > > > > > > > 2. metadata.recovery.strategy and > > > metadata.recovery.rebootstrap.trigger.ms > > > > : > > > > Should we add them to share consumer and kstreams too or leave that > for > > > > future KIPs? > > > > > > > > Jun > > > > > > > > On Mon, Nov 4, 2024 at 7:38 AM Rajini Sivaram < > rajinisiva...@gmail.com > > > > > > > wrote: > > > > > > > > > Hi Apoorv, > > > > > > > > > > Thanks for the review. Good idea to prefix with > metadata.recovery, I > > > > have > > > > > renamed the config to metadata.recovery.rebootstrap.trigger.ms as > > you > > > > > suggested. > > > > > > > > > > Regards, > > > > > > > > > > Rajini > > > > > > > > > > On Mon, Nov 4, 2024 at 2:43 PM Apoorv Mittal < > > apoorvmitta...@gmail.com > > > > > > > > > wrote: > > > > > > > > > > > Hi Rajni, > > > > > > Thanks for the KIP. It would be a good addition. > > > > > > > > > > > > If I understand it correctly then the client will trigger the > > > > > re-bootstrap > > > > > > process when the client is unable to obtain metadata in a defined > > > > period > > > > > of > > > > > > time. The proposed configuration "rebootstrap.timeout.ms" seems > to > > > me > > > > > more > > > > > > of a timeout for the re-bootstrap process, rather than the time > to > > > > > trigger > > > > > > the re-bootstrap process. > > > > > > > > > > > > Do you think "metadata.recovery.rebootstrap.trigger.ms" would > be a > > > > > better > > > > > > name? > > > > > > > > > > > > - > > > > > > > > > > > > *metadata.recovery.rebootstrap.trigger.ms > > > > > > <http://metadata.recovery.rebootstrap.trigger.ms>:* This is a > > > clear > > > > > and > > > > > > concise option. It directly conveys the purpose of the > > > > configuration, > > > > > > which > > > > > > is to define the timeout for triggering a re-bootstrap when > > > metadata > > > > > > cannot > > > > > > be obtained. It specifies not only the trigger but also the > > > recovery > > > > > > strategy (re-bootstrap) associated with it. > > > > > > > > > > > > Additionally, the other option I was thinking of was " > > > > > > metadata.rebootstrap.trigger.ms". > > > > > > > > > > > > Regards, > > > > > > Apoorv Mittal > > > > > > > > > > > > > > > > > > On Mon, Nov 4, 2024 at 10:17 AM Rajini Sivaram < > > > > rajinisiva...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > Any other feedback or suggestions? If there are no concerns, I > > will > > > > > start > > > > > > > the vote tomorrow. > > > > > > > > > > > > > > Thank you, > > > > > > > > > > > > > > Rajini > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 30, 2024 at 4:27 PM Rajini Sivaram < > > > > > rajinisiva...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Andrew, > > > > > > > > > > > > > > > > Thanks for reviewing the KIP. > > > > > > > > > > > > > > > > AS1: Updated to KIP-559. Thanks for pointing that out. > > > > > > > > AS2: Yes, that is correct. The error code is a performance > > > > > optimization > > > > > > > to > > > > > > > > avoid unavailability for 5 minutes when a proxy may be able > to > > > tell > > > > > the > > > > > > > > client to rebootstrap immediately. > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > > > Rajini > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 30, 2024 at 2:51 PM Andrew Schofield < > > > > > > > > andrew_schofield_j...@outlook.com> wrote: > > > > > > > > > > > > > > > >> Hi Rajini, > > > > > > > >> Thanks for the KIP. It looks like a useful improvement for > > > > > > > >> rebootstrapping. > > > > > > > >> I really like using the new major release as a way to change > > the > > > > > > default > > > > > > > >> so > > > > > > > >> rebootstrapping can be taken for granted in the future. > > > > > > > >> > > > > > > > >> A couple of comments. > > > > > > > >> AS1: (nit) KIP-559, not KIP-599, was the proxy-friendliness > > KIP. > > > > > > > >> AS2: I suppose that the new error code is really just a > > > > performance > > > > > > > >> optimisation for situations where the proxy can tell that > > > > > > > rebootstrapping > > > > > > > >> is required and it can avoid the clients waiting the full 5 > > > > minutes. > > > > > > Is > > > > > > > >> this > > > > > > > >> correct or is there more to it? > > > > > > > >> > > > > > > > >> Thanks, > > > > > > > >> Andrew > > > > > > > >> > > > > > > > >> ________________________________________ > > > > > > > >> From: Rajini Sivaram <rajinisiva...@gmail.com> > > > > > > > >> Sent: 28 October 2024 19:31 > > > > > > > >> To: dev <dev@kafka.apache.org> > > > > > > > >> Subject: [DISCUSS] KIP-1102: Enable clients to rebootstrap > > based > > > > on > > > > > > > >> timeout or error code > > > > > > > >> > > > > > > > >> Hi everyone, > > > > > > > >> > > > > > > > >> I would like to start discussion on KIP-1102 ( > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code > > > > > > > >> ). > > > > > > > >> This KIP extends KIP-899 by introducing a timeout > > configuration > > > > and > > > > > > > error > > > > > > > >> code that can be used to trigger rebootstrapping in clients. > > The > > > > KIP > > > > > > > also > > > > > > > >> proposes to enable rebootstrapping by default in 4.0.0. > > > > > > > >> > > > > > > > >> Thank you, > > > > > > > >> > > > > > > > >> Rajini > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >