Hi, Rajini, Thanks for the KIP. Looks good to me overall. Just a couple of minor comments.
1. "So if a connection attempt is stuck for over a second, other nodes that are in reconnection backoff state will no longer be in backoff and as a result, rebootstrap will never be attempted." Not sure that I understand this. Why would a stuck connection to one node affect the reconnection backoff state to other nodes? 2. metadata.recovery.strategy and metadata.recovery.rebootstrap.trigger.ms: Should we add them to share consumer and kstreams too or leave that for future KIPs? Jun On Mon, Nov 4, 2024 at 7:38 AM Rajini Sivaram <rajinisiva...@gmail.com> wrote: > Hi Apoorv, > > Thanks for the review. Good idea to prefix with metadata.recovery, I have > renamed the config to metadata.recovery.rebootstrap.trigger.ms as you > suggested. > > Regards, > > Rajini > > On Mon, Nov 4, 2024 at 2:43 PM Apoorv Mittal <apoorvmitta...@gmail.com> > wrote: > > > Hi Rajni, > > Thanks for the KIP. It would be a good addition. > > > > If I understand it correctly then the client will trigger the > re-bootstrap > > process when the client is unable to obtain metadata in a defined period > of > > time. The proposed configuration "rebootstrap.timeout.ms" seems to me > more > > of a timeout for the re-bootstrap process, rather than the time to > trigger > > the re-bootstrap process. > > > > Do you think "metadata.recovery.rebootstrap.trigger.ms" would be a > better > > name? > > > > - > > > > *metadata.recovery.rebootstrap.trigger.ms > > <http://metadata.recovery.rebootstrap.trigger.ms>:* This is a clear > and > > concise option. It directly conveys the purpose of the configuration, > > which > > is to define the timeout for triggering a re-bootstrap when metadata > > cannot > > be obtained. It specifies not only the trigger but also the recovery > > strategy (re-bootstrap) associated with it. > > > > Additionally, the other option I was thinking of was " > > metadata.rebootstrap.trigger.ms". > > > > Regards, > > Apoorv Mittal > > > > > > On Mon, Nov 4, 2024 at 10:17 AM Rajini Sivaram <rajinisiva...@gmail.com> > > wrote: > > > > > Hi all, > > > > > > Any other feedback or suggestions? If there are no concerns, I will > start > > > the vote tomorrow. > > > > > > Thank you, > > > > > > Rajini > > > > > > > > > On Wed, Oct 30, 2024 at 4:27 PM Rajini Sivaram < > rajinisiva...@gmail.com> > > > wrote: > > > > > > > Hi Andrew, > > > > > > > > Thanks for reviewing the KIP. > > > > > > > > AS1: Updated to KIP-559. Thanks for pointing that out. > > > > AS2: Yes, that is correct. The error code is a performance > optimization > > > to > > > > avoid unavailability for 5 minutes when a proxy may be able to tell > the > > > > client to rebootstrap immediately. > > > > > > > > Regards, > > > > > > > > Rajini > > > > > > > > > > > > > > > > On Wed, Oct 30, 2024 at 2:51 PM Andrew Schofield < > > > > andrew_schofield_j...@outlook.com> wrote: > > > > > > > >> Hi Rajini, > > > >> Thanks for the KIP. It looks like a useful improvement for > > > >> rebootstrapping. > > > >> I really like using the new major release as a way to change the > > default > > > >> so > > > >> rebootstrapping can be taken for granted in the future. > > > >> > > > >> A couple of comments. > > > >> AS1: (nit) KIP-559, not KIP-599, was the proxy-friendliness KIP. > > > >> AS2: I suppose that the new error code is really just a performance > > > >> optimisation for situations where the proxy can tell that > > > rebootstrapping > > > >> is required and it can avoid the clients waiting the full 5 minutes. > > Is > > > >> this > > > >> correct or is there more to it? > > > >> > > > >> Thanks, > > > >> Andrew > > > >> > > > >> ________________________________________ > > > >> From: Rajini Sivaram <rajinisiva...@gmail.com> > > > >> Sent: 28 October 2024 19:31 > > > >> To: dev <dev@kafka.apache.org> > > > >> Subject: [DISCUSS] KIP-1102: Enable clients to rebootstrap based on > > > >> timeout or error code > > > >> > > > >> Hi everyone, > > > >> > > > >> I would like to start discussion on KIP-1102 ( > > > >> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code > > > >> ). > > > >> This KIP extends KIP-899 by introducing a timeout configuration and > > > error > > > >> code that can be used to trigger rebootstrapping in clients. The KIP > > > also > > > >> proposes to enable rebootstrapping by default in 4.0.0. > > > >> > > > >> Thank you, > > > >> > > > >> Rajini > > > >> > > > > > > > > > >