Hi Greg, > Applying this to a hypothetical extension of timeouts > to other endpoints: my concern was primarily about > whether this design applied everywhere felt natural or > awkward. In my opinion, a request parameter would > draw a lot of outsized attention in documentation, and > be the first parameter for a lot of the endpoints, while a > header could be specified as a more obscure global > modifier to all requests.
That's a fair point, I do agree that the choice of a request header here seems more appropriate now. > My chief concern was about the clarity of the API in the > future when both configurable timeouts and asynchronous > APIs exist. Will the configurable timeouts still make sense, > and will they be a feature that people will use? I do believe that there's a place for both of these to co-exist - there will always be use-cases that require config validation errors to be reported synchronously so that connector configurations can be corrected or iterated upon. I don't really see there being valid use cases for having a timeout of 20 or 60 minutes as you suggested, but I do think that the genuine edge cases that currently exist requiring a config validation timeout of greater than the current 90 seconds (2-3 minutes perhaps) deserve a way to be handled until we have an asynchronous config validation API (if we ever do go down that path). I've updated the KIP to reflect the change to use a request header rather than a query parameter to configure the timeout value. Thanks, Yash On Tue, Mar 14, 2023 at 2:46 AM Greg Harris <greg.har...@aiven.io.invalid> wrote: > Yash, > > 1. I think that Request-Timeout is a fine choice. I did not see any > standard header for a use-case like this, so we are free to choose a name > for ourselves. > > 2. While such proposals do not exist, I don't think that means that we > should exclude them from consideration. If we do, then we run the risk of > making future implementations more complicated, impossible, or awkward to > use. This is a balance of course, as we cannot let future features paralyze > us from making improvements today. We cannot hold ourselves responsible for > handling unknown unknowns, but we can at least try to plan around known > unknowns. > Applying this to a hypothetical extension of timeouts to other endpoints: > my concern was primarily about whether this design applied everywhere felt > natural or awkward. In my opinion, a request parameter would draw a lot of > outsized attention in documentation, and be the first parameter for a lot > of the endpoints, while a header could be specified as a more obscure > global modifier to all requests. > > 3. 4. Applying the same standard to a hypothetical asynchronous I don't > believe it makes async validation substantially more complicated, except > the "request timeout interrupts connector creation" interaction. I also do > not think that it would make it impossible to implement an async validation > scheme in the future, since the synchronization is already in place, the > timeout is just different. > My chief concern was about the clarity of the API in the future when both > configurable timeouts and asynchronous APIs exist. Will the configurable > timeouts still make sense, and will they be a feature that people will use? > Are we expecting users to hold a single REST request open for 5, 20, or 60 > minutes to accommodate a connector with an excessively long validation > step? Successfully validating a request requires the caller and the network > to remain stable for the entire duration of the request to have a chance of > success, and consumes (some small) amount of resources during that time. > And what if they decide they no longer want the validation result, and wish > to cancel the operation to free resources (both connect and external > system)? There is no standard mechanism to discard or cancel an HTTP > request (not even a connection loss). > > You have said that the default timeout for other endpoints is more than > satisfactory for the typical user, and I agree. If other operations > (deleting, restarting, pause/unpause, etc) had a substantial synchronous > component that would benefit from a configurable timeout, I would be more > inclined to support configurable timeouts (all at once, or validation first > and other operations later). > But as it stands, I don't think I see the value-add of configurable > timeouts once there is another solution which targets the validate call > duration specifically, and I'm concerned that configurable timeouts would > be made irrelevant in that case. > > Thanks, > Greg > > On Wed, Mar 8, 2023 at 7:21 AM Yash Mayya <yash.ma...@gmail.com> wrote: > > > Hi Greg, > > > > Thanks for the response! > > > > 1. Hm that's a fair point, and while there aren't any strict guidelines > or > > rules governing whether something should be a query parameter versus a > > request header, I agree that it might be more idiomatic for a request > > timeout value to be accepted via a custom request header. What do you > > think about using a header name like "X-Request-Timeout", or maybe simply > > "Request-Timeout" if we want to take into account > > https://www.rfc-editor.org/rfc/rfc6648? > > > > 2. 3. 4. Adding asynchronous validations via new endpoints / new request > > parameters on existing endpoints is definitely an interesting idea but > I'm > > not sure whether this KIP should take into consideration a hypothetical > > future proposal? If there were already such a concrete proposal, I would > > definitely agree that this KIP should take that into consideration - but > > that isn't the case today. Furthermore, I'm not sure I understand why the > > addition of request timeout configurability to the existing endpoints > would > > preclude the introduction of an asynchronous validation API in the > future? > > > > Thanks, > > Yash > > > > On Sat, Mar 4, 2023 at 1:13 AM Greg Harris <greg.har...@aiven.io.invalid > > > > wrote: > > > > > Yash, > > > > > > 1. > > > Currently the request parameters in the REST API are individual and > > pertain > > > to just one endpoint. > > > They also change the content of the query result, or change the action > > > taken on the cluster. > > > I think that a request timeout is a property of the HTTP request more > > than > > > it is a property of the cluster query or cluster action. > > > The two solutions have very similar tradeoffs, but I'm interested in > > > whether one is more idiomatic and more obvious to users. > > > > > > 2. > > > I understand that only these three endpoints are in need of increased > > > timeouts at this time due to long connector validations. > > > From another perspective, this change is making the API more irregular > > and > > > someone in the future might choose to make it more regular by > > standardizing > > > the configurable timeout functionality. > > > I wouldn't (in this KIP) dismiss someone's desire to configure other > > > timeouts in the future (in another KIP), and design them into a corner. > > > It is acceptable to limit the scope of this change to just the three > > > endpoints due to practical reasons, but I don't think that should > prevent > > > us from trying to ensure that this design fits in the "end goal" state > of > > > the Connect service. > > > > > > 3. 4. > > > I am not suggesting an incompatible change, as the current synchronous > > > behavior is still a useful API for certain situations. I think that it > is > > > possible to add asynchronous validations in a backwards compatible way, > > > using new endpoints or other new request parameters. > > > The interface could be designed such that users with connectors that > > exceed > > > the synchronous timeouts can utilize the asynchronous API. Tooling can > > use > > > the asynchronous API when it is available, and fall back to the > > synchronous > > > API when it is not. > > > I think that it also may be more in-line with the design of the rest of > > the > > > REST API, where nearly every other request is asynchronous. That's why > > > you're only targeting these three endpoints, they're the only ones > with a > > > synchronicity constraint. > > > Again, I'm not necessarily saying that you must implement such an > > > asynchronous validation scheme in this KIP, but we should consider if > > that > > > is a more extensible solution. If we decided to implement configurable > > > synchronous timeouts now, how would that complement an asynchronous API > > in > > > the future? > > > > > > On Thu, Mar 2, 2023 at 10:00 PM Yash Mayya <yash.ma...@gmail.com> > wrote: > > > > > > > Hi Greg, > > > > > > > > Thanks for taking a look! > > > > > > > > 1. I believe Chris suggested the use of a query parameter above as we > > > > already have precedents for using query parameters to configure per > > > request > > > > behavior in Kafka Connect (the restart connectors API and the get > > > > connectors API for instance). Also, the limited choice of endpoints > > > > targeted is intentional (see my reply to the next point). > > > > > > > > 2. I intentionally targeted just the three listed endpoints where > > > > synchronous connector config validations come into the picture. This > is > > > > because of the legitimate cases where config validation for specific > > > > connector plugins might exceed the default request timeout in edge > case > > > > scenarios (outlined in the KIP's motivation section). Other Connect > > REST > > > > endpoints shouldn't be taking longer than the default 90 second > request > > > > timeout; if they do so, it would either be indicative of a bug in the > > > > Connect framework or a cluster health issue - neither of which should > > be > > > > covered up by manually setting a longer request timeout. > > > > > > > > 3. 4. I think changing the config validation behavior would be a > > backward > > > > incompatible change and I wanted to avoid that in this particular > KIP. > > > > There are multiple programmatic UIs out there which rely on the > current > > > > synchronous config validation behavior and breaking the existing > > contract > > > > would definitely require a larger discussion. > > > > > > > > Thanks, > > > > Yash > > > > > > > > On Fri, Mar 3, 2023 at 12:04 AM Greg Harris > > <greg.har...@aiven.io.invalid > > > > > > > > wrote: > > > > > > > > > Hey Yash, > > > > > > > > > > Thanks for the KIP, and sorry for the late review. > > > > > > > > > > 1. Have you considered a HTTP header to provide the > > client-configurable > > > > > timeout? A header might more naturally extend to all of the other > > > > endpoints > > > > > in the future, rather than duplicating the query parameter across > > > > > endpoints. > > > > > > > > > > 2. I understand that this change is targeted at just long duration > > > > > Connector::validation calls, is that due to voluntary scope > > > constraints? > > > > > Implementing configurable timeouts for all endpoints in a uniform > way > > > > could > > > > > be desirable, even if the default timeout will work for nearly all > of > > > the > > > > > other calls. > > > > > > > > > > 3. Did you consider adding asynchronous validation as a user-facing > > > > > feature? I think that relying on the synchronous validation results > > in > > > a > > > > > single HTTP request is a bit of an anti-pattern, and one that we've > > > > > inherited from the original REST design. It seems useful when using > > the > > > > > REST API by hand, but seems to be a liability when used in > > environments > > > > > with an external management layer. > > > > > > > > > > 4. Perhaps rather than allowing synchronous calls to > > Connector:validate > > > > to > > > > > increase in duration, we should provide a way for connectors to > > surface > > > > > validation problems later in their lifecycle. Currently there is > the > > > > > ConnectorContext::raiseError that transitions the task to FAILED, > > > > perhaps a > > > > > similar API could asynchronously emit re-validation results. We've > > also > > > > had > > > > > a problem with long start() duration for the same reasons as > > > validate(). > > > > > > > > > > I understand if you want to keep this KIP as tightly focused as > > > possible, > > > > > but i'm worried that it is addressing the symptom and not the > > problem. > > > I > > > > > want to make sure that this change is impactful and isn't obsoleted > > by > > > a > > > > > later improvement. > > > > > > > > > > Thanks, > > > > > Greg > > > > > > > > > > > > > > > On Wed, Mar 1, 2023 at 5:07 AM Yash Mayya <yash.ma...@gmail.com> > > > wrote: > > > > > > > > > > > Hi all, > > > > > > > > > > > > Thanks for all the feedback and discussion. I've renamed the KIP > > > title > > > > to > > > > > > "Kafka Connect REST API timeout improvements" since we're > > > introducing a > > > > > > couple of improvements (cancelling create / update connector > > requests > > > > > when > > > > > > config validations timeout and avoiding double config validations > > in > > > > > > distributed mode) along with making the request timeouts > > > configurable. > > > > > The > > > > > > new KIP link is > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-882%3A+Kafka+Connect+REST+API+timeout+improvements > > > > > > and I've called for a vote for the KIP here - > > > > > > https://lists.apache.org/thread/mgx8lczx2f57pk7x3vh0nqk00s79grgp > . > > > > > > > > > > > > Thanks, > > > > > > Yash > > > > > > > > > > > > On Sat, Nov 5, 2022 at 11:42 PM Sagar <sagarmeansoc...@gmail.com > > > > > > wrote: > > > > > > > > > > > > > Hey Yash, > > > > > > > > > > > > > > Thanks for the explanation. I think it should be fine to > delegate > > > the > > > > > > > validation directly to the leader. > > > > > > > > > > > > > > Thanks! > > > > > > > Sagar. > > > > > > > > > > > > > > On Sat, Nov 5, 2022 at 10:42 AM Yash Mayya < > yash.ma...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Sagar, > > > > > > > > > > > > > > > > Thanks for chiming in! > > > > > > > > > > > > > > > > > Having said that, why does the worker forward to the > > > > > > > > > leader? I am thinking if the worker can perform the > > validation > > > on > > > > > > it's > > > > > > > > own, > > > > > > > > > we could let it do the validation instead of forwarding > > > > everything > > > > > to > > > > > > > the > > > > > > > > > leader > > > > > > > > > > > > > > > > Only the leader is allowed to perform writes to the config > > topic, > > > > so > > > > > > any > > > > > > > > request that requires a config topic write ends up being > > > forwarded > > > > to > > > > > > the > > > > > > > > leader. The `POST /connectors` and `PUT > > > > > /connectors/{connector}/config` > > > > > > > > endpoints call `Herder::putConnectorConfig` which internally > > > does a > > > > > > > config > > > > > > > > validation first before initiating another herder request to > > > write > > > > to > > > > > > the > > > > > > > > config topic (in distributed mode). If we want to allow the > > first > > > > > > worker > > > > > > > to > > > > > > > > do the config validation, and then forward the request to the > > > > leader > > > > > > just > > > > > > > > for the write to the config topic, we'd either need something > > > like > > > > a > > > > > > skip > > > > > > > > validations query parameter on the endpoint like Chris talks > > > about > > > > > > above > > > > > > > or > > > > > > > > else a new internal only endpoint that just does a write to > the > > > > > config > > > > > > > > topic without any prior config validation. However, we agreed > > > that > > > > > this > > > > > > > > optimization doesn't really seem necessary for now and can be > > > done > > > > > > later > > > > > > > if > > > > > > > > deemed useful. I'd be happy to hear differing thoughts if > any, > > > > > however. > > > > > > > > > > > > > > > > > I think a bound is certainly needed but IMO it shouldn't go > > > > beyond > > > > > > > > > 10 mins considering this is just validation > > > > > > > > > > > > > > > > Yeah, I agree that this seems like a fair value - I've > elected > > to > > > > go > > > > > > > with a > > > > > > > > default value of 10 minutes for the proposed worker > > configuration > > > > > that > > > > > > > sets > > > > > > > > an upper bound for the timeout query parameter. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Yash > > > > > > > > > > > > > > > > On Sat, Nov 5, 2022 at 10:30 AM Yash Mayya < > > yash.ma...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi Chris, > > > > > > > > > > > > > > > > > > Thanks again for your feedback. I think a worker > > configuration > > > > for > > > > > > the > > > > > > > > > upper bound makes sense - I initially thought we could > > hardcode > > > > it > > > > > > > (just > > > > > > > > > like the current request timeout is), but there's no reason > > to > > > > set > > > > > > > > another > > > > > > > > > artificial bound that isn't user configurable which is > > exactly > > > > what > > > > > > > we're > > > > > > > > > trying to change here in the first place. I've updated the > > KIP > > > > > based > > > > > > on > > > > > > > > all > > > > > > > > > our discussion above. > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Yash > > > > > > > > > > > > > > > > > > On Thu, Nov 3, 2022 at 11:01 PM Sagar < > > > sagarmeansoc...@gmail.com > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> Hey Yash, > > > > > > > > >> > > > > > > > > >> Thanks for the KIP! This looks like a useful feature. > > > > > > > > >> > > > > > > > > >> I think the discussion thread already has some great > points > > by > > > > > > Chris. > > > > > > > > Just > > > > > > > > >> a couple of points/clarifications=> > > > > > > > > >> > > > > > > > > >> Regarding, pt#2 , I guess it might be better to forward to > > the > > > > > > leader > > > > > > > as > > > > > > > > >> suggested by Yash. Having said that, why does the worker > > > forward > > > > > to > > > > > > > the > > > > > > > > >> leader? I am thinking if the worker can perform the > > validation > > > > on > > > > > > it's > > > > > > > > >> own, > > > > > > > > >> we could let it do the validation instead of forwarding > > > > everything > > > > > > to > > > > > > > > the > > > > > > > > >> leader(even though it might be cheap to forward all > requests > > > to > > > > > the > > > > > > > > >> leader). > > > > > > > > >> > > > > > > > > >> Pt#3 => I think a bound is certainly needed but IMO it > > > shouldn't > > > > > go > > > > > > > > beyond > > > > > > > > >> 10 mins considering this is just validation. We shouldn't > > end > > > up > > > > > in > > > > > > a > > > > > > > > >> situation where a few faulty connectors end up blocking a > > lot > > > of > > > > > > > request > > > > > > > > >> processing threads, so while increasing the config is > > > certainly > > > > > > > helpful, > > > > > > > > >> we > > > > > > > > >> shouldn't set too high a value IMO. Of course I am also > open > > > to > > > > > > > > >> suggestions > > > > > > > > >> here. > > > > > > > > >> > > > > > > > > >> Thanks! > > > > > > > > >> Sagar. > > > > > > > > >> > > > > > > > > >> On Thu, Nov 3, 2022 at 9:01 PM Chris Egerton > > > > > > <chr...@aiven.io.invalid > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > >> > > > > > > > > >> > Hi Yash, > > > > > > > > >> > > > > > > > > > >> > RE 2: That's a great point about validations already > being > > > > > > performed > > > > > > > > by > > > > > > > > >> the > > > > > > > > >> > leader. For completeness's sake, I'd like to note that > > this > > > > only > > > > > > > holds > > > > > > > > >> for > > > > > > > > >> > valid configurations; invalid ones are caught right now > > > before > > > > > > being > > > > > > > > >> > forwarded to the leader. Still, I think it's fine to > > forward > > > > to > > > > > > the > > > > > > > > >> leader > > > > > > > > >> > for now and optimize further in the future if necessary. > > If > > > > > > frequent > > > > > > > > >> > validations are taking place they should be conducted > via > > > the > > > > > `PUT > > > > > > > > >> > /connector-plugins/{pluginName}/config/validate` > endpoint, > > > > which > > > > > > > won't > > > > > > > > >> do > > > > > > > > >> > any forwarding at all. > > > > > > > > >> > > > > > > > > > >> > RE 3: Yes, those endpoints LGTM. And yes, bounds on the > > > > timeout > > > > > > also > > > > > > > > >> seem > > > > > > > > >> > reasonable... maybe a low-importance worker property > could > > > > work > > > > > > for > > > > > > > > >> that? > > > > > > > > >> > Not sure what would make sense for a default; probably > > > > somewhere > > > > > > in > > > > > > > > the > > > > > > > > >> > 10-60 minute range but would be interested in others' > > > > thoughts. > > > > > > > > >> > > > > > > > > > >> > Thanks for the clarification on the zombie fencing > logic. > > I > > > > > think > > > > > > we > > > > > > > > >> might > > > > > > > > >> > want to have some more subtle logic around the > interaction > > > > > between > > > > > > > > >> calls to > > > > > > > > >> > Admin::fenceProducers and a worker-level timeout > property > > if > > > > we > > > > > go > > > > > > > > that > > > > > > > > >> > route, but we can cross that particular bridge if we get > > > back > > > > to > > > > > > it. > > > > > > > > >> > > > > > > > > > >> > Cheers, > > > > > > > > >> > > > > > > > > > >> > Chris > > > > > > > > >> > > > > > > > > > >> > On Wed, Nov 2, 2022 at 1:48 PM Yash Mayya < > > > > yash.ma...@gmail.com > > > > > > > > > > > > > > wrote: > > > > > > > > >> > > > > > > > > > >> > > Hi Chris, > > > > > > > > >> > > > > > > > > > > >> > > Thanks a lot for the super quick response and the > great > > > > > > feedback! > > > > > > > > >> > > > > > > > > > > >> > > 1. I think that makes a lot of sense, and I'd be happy > > to > > > > > update > > > > > > > the > > > > > > > > >> KIP > > > > > > > > >> > to > > > > > > > > >> > > include this change in the scope. The current behavior > > > where > > > > > the > > > > > > > API > > > > > > > > >> > > response indicates a time out but the connector is > > > > > > created/updated > > > > > > > > >> > > eventually anyway can be pretty confusing and is > > generally > > > > > not a > > > > > > > > good > > > > > > > > >> > user > > > > > > > > >> > > experience IMO. > > > > > > > > >> > > > > > > > > > > >> > > 2. Wow, thanks for pointing this out - it's a really > > good > > > > > catch > > > > > > > and > > > > > > > > >> > > something I hadn't noticed was happening with the > > current > > > > > > > > >> implementation. > > > > > > > > >> > > While I do like the idea of having a query parameter > > that > > > > > > > determines > > > > > > > > >> > > whether validations can be skipped, I'm wondering if > it > > > > might > > > > > > not > > > > > > > be > > > > > > > > >> > easier > > > > > > > > >> > > and cleaner to just do the leader check earlier and > > avoid > > > > > doing > > > > > > > the > > > > > > > > >> > > unnecessary config validation on the first worker? > Since > > > > each > > > > > > > config > > > > > > > > >> > > validation happens on its own thread, I'm not so sure > > > about > > > > > the > > > > > > > > >> concern > > > > > > > > >> > of > > > > > > > > >> > > overloading the leader even on larger clusters, > > especially > > > > > since > > > > > > > > >> > > validations aren't typically long running operations. > > > > > > Furthermore, > > > > > > > > >> even > > > > > > > > >> > > with the current implementation, the leader will > always > > be > > > > > > doing a > > > > > > > > >> config > > > > > > > > >> > > validation for connector create / update REST API > > requests > > > > on > > > > > > any > > > > > > > > >> worker. > > > > > > > > >> > > > > > > > > > > >> > > 3. That's a good point, and this way we can also > > restrict > > > > the > > > > > > APIs > > > > > > > > >> whose > > > > > > > > >> > > timeouts are configurable - I'm thinking `PUT > > > > > > > > >> > > /connector-plugins/{pluginName}/config/validate`, > `POST > > > > > > > /connectors` > > > > > > > > >> and > > > > > > > > >> > > `PUT /connectors/{connector}/config` are the ones > where > > > > such a > > > > > > > > timeout > > > > > > > > >> > > parameter could be useful. Also, do you think we > should > > > > > enforce > > > > > > > some > > > > > > > > >> > > reasonable bounds for the timeout config? > > > > > > > > >> > > > > > > > > > > >> > > On the zombie fencing point, the implication was that > > the > > > > new > > > > > > > worker > > > > > > > > >> > > property would not control the timeout used for the > call > > > to > > > > > > > > >> > > Admin::fenceProducers. However, if we go with a > timeout > > > > query > > > > > > > > >> parameter > > > > > > > > >> > > approach, even the timeout for the `PUT > > > > > > > > /connectors/{connector}/fence' > > > > > > > > >> > > endpoint will remain unaffected. > > > > > > > > >> > > > > > > > > > > >> > > Thanks, > > > > > > > > >> > > Yash > > > > > > > > >> > > > > > > > > > > >> > > On Wed, Nov 2, 2022 at 8:13 PM Chris Egerton > > > > > > > > <chr...@aiven.io.invalid > > > > > > > > >> > > > > > > > > > >> > > wrote: > > > > > > > > >> > > > > > > > > > > >> > > > Hi Yash, > > > > > > > > >> > > > > > > > > > > > >> > > > Thanks for the KIP. It's a nice, focused change. > > > > Initially I > > > > > > was > > > > > > > > >> > hesitant > > > > > > > > >> > > > to support cases where connector validation takes > this > > > > long, > > > > > > but > > > > > > > > >> > > > considering the alternative is that we give users a > > 500 > > > > > error > > > > > > > > >> response > > > > > > > > >> > > but > > > > > > > > >> > > > leave the request to create/modify the connector > > queued > > > up > > > > > in > > > > > > > the > > > > > > > > >> > > herder, I > > > > > > > > >> > > > think I can get behind the motivation here. There's > > also > > > > an > > > > > > > > >> argument to > > > > > > > > >> > > be > > > > > > > > >> > > > made about keeping Kafka Connect available even when > > the > > > > > > systems > > > > > > > > >> that > > > > > > > > >> > it > > > > > > > > >> > > > connects to are in a degraded state. > > > > > > > > >> > > > > > > > > > > > >> > > > I have a few alternatives I'd be interested in your > > > > thoughts > > > > > > on: > > > > > > > > >> > > > > > > > > > > > >> > > > 1. Since the primary concern here seems to be that > > > custom > > > > > > > > connector > > > > > > > > >> > > > validation logic can take too long, do we have any > > > > thoughts > > > > > on > > > > > > > > >> adding > > > > > > > > >> > > logic > > > > > > > > >> > > > to check for request timeout after validation has > > > > completed > > > > > > and, > > > > > > > > if > > > > > > > > >> it > > > > > > > > >> > > has, > > > > > > > > >> > > > aborting the attempt to create/modify the connector? > > > > > > > > >> > > > > > > > > > > > >> > > > 2. Right now it's possible that we'll perform two > > > > connector > > > > > > > config > > > > > > > > >> > > > validations per create/modify request; once on the > > > worker > > > > > that > > > > > > > > >> > initially > > > > > > > > >> > > > receives the request, and then again if that worker > is > > > not > > > > > the > > > > > > > > >> leader > > > > > > > > >> > of > > > > > > > > >> > > > the cluster and has to forward the request to the > > > leader. > > > > > Any > > > > > > > > >> thoughts > > > > > > > > >> > on > > > > > > > > >> > > > optimizing this to only require a single validation > > per > > > > > > request? > > > > > > > > We > > > > > > > > >> > > > probably wouldn't want to force all validations to > > take > > > > > place > > > > > > on > > > > > > > > the > > > > > > > > >> > > leader > > > > > > > > >> > > > (could lead to overloading it pretty quickly in > large > > > > > > clusters), > > > > > > > > >> but we > > > > > > > > >> > > > could add an internal-only query parameter to skip > > > > > validation > > > > > > > and > > > > > > > > >> then > > > > > > > > >> > > use > > > > > > > > >> > > > that parameter when forwarding requests from > followers > > > to > > > > > the > > > > > > > > >> leader. > > > > > > > > >> > > > > > > > > > > > >> > > > 3. A worker property is pretty coarse-grained, and > > > > difficult > > > > > > to > > > > > > > > >> change. > > > > > > > > >> > > We > > > > > > > > >> > > > might allow per-request toggling of the timeout by > > > adding > > > > a > > > > > > URL > > > > > > > > >> query > > > > > > > > >> > > > parameter like '?timeout=90s' to the REST API to > allow > > > > > > tweaking > > > > > > > of > > > > > > > > >> the > > > > > > > > >> > > > timeout on a more granular basis, and without having > > to > > > > > > perform > > > > > > > a > > > > > > > > >> > worker > > > > > > > > >> > > > restart. > > > > > > > > >> > > > > > > > > > > > >> > > > I'd also like to clarify a point about the rejected > > > > > > alternative > > > > > > > > >> "Allow > > > > > > > > >> > > > configuring producer zombie fencing admin request > > > > > timeout"--is > > > > > > > the > > > > > > > > >> > > > implication here that the " > > rest.api.request.timeout.ms" > > > > > > > property > > > > > > > > >> will > > > > > > > > >> > > not > > > > > > > > >> > > > control the REST timeout for requests to the 'PUT > > > > > > > > >> > > > /connectors/{connector}/fence' endpoint, or just > that > > it > > > > > won't > > > > > > > > >> control > > > > > > > > >> > > the > > > > > > > > >> > > > timeout that we use for the call to > > > Admin::fenceProducers? > > > > > > > > >> > > > > > > > > > > > >> > > > Cheers, > > > > > > > > >> > > > > > > > > > > > >> > > > Chris > > > > > > > > >> > > > > > > > > > > > >> > > > On Wed, Nov 2, 2022 at 10:07 AM Yash Mayya < > > > > > > > yash.ma...@gmail.com> > > > > > > > > >> > wrote: > > > > > > > > >> > > > > > > > > > > > >> > > > > Hi all, > > > > > > > > >> > > > > > > > > > > > > >> > > > > I'd like to start a discussion thread on this > small > > > KIP > > > > - > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://eu01.z.antigena.com/l/8mqVuSEhdZF05LKGk7J1Cb6esUWht6ps4-73ocd196fgyxC~lGhTLqBEt9BPJy1MBx6RzKVi7qRYYpJwCHRixlmdcayj7YMv~-0fIZJ2f8EjEhp-Qck3~O3Ga1JBTgaY4tTAlwh0VjF6217iBaGVG2PNa-2TO5z8zC5s0Nb28kcG2mUg1tcEfzsAtmWEXWRECoeD4cdUXyPvWeFtYL-ccVv-PmQydgSIYkdeR7~XItAXezft9Mfs5pzjejXSK~MhWW > > > > > > > > >> > > > > > > > > > > > > >> > > > > It proposes the addition of a new Kafka Connect > > worker > > > > > > > > >> configuration > > > > > > > > >> > to > > > > > > > > >> > > > > allow configuring REST API request timeouts. > > > > > > > > >> > > > > > > > > > > > > >> > > > > Thanks, > > > > > > > > >> > > > > Yash > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >