Gus, Hmm… this sounds better than using NOW, which may be significantly off if nodes are out of sync and as a result could trigger immediate termination of sub-requests.
The downside to this approach is that we can’t properly account for the in-flight time (between tagging and sending a shard request up to the receiving node starting the processing) but that seems a small price to pay and it should only introduce a small and consistent error. We could add this logic somewhere in HttpSolrCall.init() instead of a servlet filter, because at this point it’s still very early into the request processing but we already have the request params. WDYT? > On 22 Sep 2025, at 23:43, Gus Heck <[email protected]> wrote: > > For timeAllowed, the proper design would seem to be to establish a "must > finish by" time, at initial query receipt, and then pass that to the > shards. The fly in that ointment would be clock skew, so one probably needs > to also send along a notion of when the sub-request was made, which the > recipient can compare to their own clock. This is similar to discounting > but would implicitly allow network request time to be "free" (untimed), but > the advantage is that it also leaves the option that in a cluster where > nodes run on machines with trustworthy clocks the skew check can be > skipped, and skew checking code can be independent of all other request > processing code. > > Main request: timeAllowed=*1*0000 > Coordinator clock at request receipt: 17585758*5*06*05* > Subrequest finishBefore=17585758*6*06*05* > > Optionally on initial request: mitigateClockSkew=true (this could also > default to true if we like) > which then causes the subrequest to also contain: clockSkewCheck=17585758*5* > 06*73* (assuming it took 68 ms to decide to send the sub-request) > > if the clockSkewCheck is sent the receiving node checks the supplied value > against its local clock and applies the (positive or negative) difference > to the finishBefore value. > > That way all logic is simple and consistent for checking the limit (just > compare to current time), systems that have synchronized clocks can benefit > from it, and most importantly, the skew adjustment code can be a super > simple HttpServletFilter that doesn't need to be worked into any of the > existing code paths. > > Then we could also allow the user to send the "finishBefore" to the > coordinator if the original requesting system also has a reliably > synchronized clock. > > On Mon, Sep 22, 2025 at 10:21 AM David Smiley <[email protected]> wrote: > >> I think it definitely makes sense to discount timeAllowed for the shard >> requests. I was thinking about this and related matters recently at my >> last job. I recall thinking that a possible solution is to use the NOW >> passed along from coordinator to shard as the epoch to base the time >> allowance on. It's captured early by the coordinator. On the other hand, >> it could be considered a hack / abuse of the intent of NOW. Maybe a >> separate parameter (or HTTP header) should be used to pass this along. >> >> I don't think it's worth bothering doing this for memory or CPU limits. >> But I'm not against it. >> >> On Mon, Sep 22, 2025 at 10:11 AM Andrzej Białecki <[email protected]> wrote: >> >>> Hi devs, >>> >>> While working on SOLR-17869 two things occurred to me: >>> >>> 1. Discounting the `timeAllowed` >>> >>> When `timeAllowed` is > 0 `TopGroupsShardRequestFactory` discounts the >>> time already spent at the query coordinator when creating shard requests. >>> This happens ONLY for the grouping queries, for all other queries >>> `timeAllowed` is passed as-is to shard requests. >>> >>> The net effect is that for other query types the same value of >>> `timeAllowed` is applied to the processing at the coordinator as well as >> to >>> the per-shard local processing. Since `timeAllowed` is a wall-clock time >>> relative to the time of the local request, not discounting the time >> already >>> spent increases the likelihood of the query timing out at the coordinator >>> (because even if shard requests all fit within the `timeAllowed` *as >>> started at each shard*, it still takes some time to return the results >> and >>> post-process them at the coordinator where timeAllowed was started much >>> earlier). >>> >>> From this POV it seems that `timeAllowed` should always be discounted >>> before sending shard requests, for all types of queries, which is not the >>> case today. >>> >>> Another question comes to mind, too: should we discount other types of >>> query limits, too? For other limit types the same “budget” is applied >> both >>> at the coordinator and locally per-shard - however, CPU and memory limits >>> are not tied to an absolute point in time so maybe it doesn’t make sense >> to >>> discount them by the amount spent at the coordinator, unless we want to >>> limit the “end-to-end” impact of a query across both local and distrib >>> sub-requests. WDYT? >>> >>> >>> 2. QueryCommand.timeAllowed >>> >>> During the upgrade to Lucene 10 a few blocks of code were commented out >> in >>> SolrIndexSearcher, and one of them was this (SolrIndexSearcher:309): >>> >>> /* final long timeAllowed = cmd.getTimeAllowed(); >>> if (timeAllowed > 0) { >>> setTimeout(new QueryTimeoutImpl(timeAllowed)); >>> }*/ >>> >>> I was surprised that none of the tests failed with this change and >> started >>> digging. QueryCommand.timeAllowed is initialized from request params. But >>> it looks like this property is no longer needed because now for each >>> request we initialize the QueryLimits from the same request params, which >>> serve the same purpose. >>> >>> What’s more, I think the commented-out section was a bug because it would >>> call `IndexSearcher.setTimeout` on a searcher instance that was used for >>> multiple queries, and that setting would persist even for other >> subsequent >>> requests. >>> >>> I propose to remove `QueryCommand.timeAllowed` altogether. >>> >>> — >>> >>> Andrzej Białecki >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [email protected] >>> For additional commands, e-mail: [email protected] >>> >>> >> > > > -- > http://www.needhamsoftware.com (work) > https://a.co/d/b2sZLD9 (my fantasy fiction book) --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
