Gus,

Hmm… this sounds better than using NOW, which may be significantly off if node
clocks are out of sync and could then trigger immediate termination of
sub-requests.

The downside to this approach is that we can’t properly account for the
in-flight time (between tagging and sending a shard request, up to the
receiving node starting to process it), but that seems a small price to pay,
and it should only introduce a small and consistent error.

We could add this logic somewhere in HttpSolrCall.init() instead of a servlet
filter: at that point it’s still very early in the request processing, but we
already have the request params. WDYT?
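
To make the idea concrete, here is a minimal sketch of the kind of tagging I
have in mind (parameter name `finishBefore` and the `DeadlineTagger` class are
illustrative only, not existing Solr APIs; the numbers reuse Gus’s example
below):

```java
import java.util.HashMap;
import java.util.Map;

public class DeadlineTagger {
    static final String TIME_ALLOWED = "timeAllowed";
    // Hypothetical parameter carrying the absolute "must finish by" deadline.
    static final String FINISH_BEFORE = "finishBefore";

    /**
     * Returns a copy of the request params with an absolute deadline added,
     * derived from timeAllowed and the coordinator's clock at request receipt.
     */
    static Map<String, String> tag(Map<String, String> params, long nowMs) {
        Map<String, String> out = new HashMap<>(params);
        String ta = params.get(TIME_ALLOWED);
        if (ta != null) {
            long allowed = Long.parseLong(ta);
            if (allowed > 0) {
                out.put(FINISH_BEFORE, Long.toString(nowMs + allowed));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> p = new HashMap<>();
        p.put(TIME_ALLOWED, "10000"); // 10s budget
        // Coordinator clock at request receipt: 1758575850605
        Map<String, String> tagged = tag(p, 1758575850605L);
        System.out.println(tagged.get(FINISH_BEFORE)); // 1758575860605
    }
}
```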

> On 22 Sep 2025, at 23:43, Gus Heck <[email protected]> wrote:
> 
> For timeAllowed, the proper design would seem to be to establish a "must
> finish by" time, at initial query receipt, and then pass that to the
> shards. The fly in that ointment would be clock skew, so one probably needs
> to also send along a notion of when the sub-request was made, which the
> recipient can compare to their own clock. This is similar to discounting,
> except it would implicitly allow network request time to be "free" (untimed);
> the advantage is that it also leaves the option that in a cluster where
> nodes run on machines with trustworthy clocks the skew check can be
> skipped, and skew checking code can be independent of all other request
> processing code.
> 
> Main request: timeAllowed=*1*0000
> Coordinator clock at request receipt: 17585758*5*06*05*
> Subrequest finishBefore=17585758*6*06*05*
> 
> Optionally on initial request: mitigateClockSkew=true (this could also
> default to true if we like)
> which then causes the subrequest to also contain: clockSkewCheck=17585758*5*
> 06*73* (assuming it took 68 ms to decide to send the sub-request)
> 
> if the clockSkewCheck is sent the receiving node checks the supplied value
> against its local clock and applies the (positive or negative) difference
> to the finishBefore value.
> 
> That way all logic is simple and consistent for checking the limit (just
> compare to current time), systems that have synchronized clocks can benefit
> from it, and most importantly, the skew adjustment code can be a super
> simple HttpServletFilter that doesn't need to be worked into any of the
> existing code paths.
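
[AB: the adjustment you describe could be as simple as this — a hedged sketch
using the parameter names from your example, not an existing Solr API; the
real thing would live in a servlet filter:]

```java
public class SkewAdjuster {
    /**
     * @param finishBefore   absolute deadline stamped by the coordinator (epoch ms)
     * @param clockSkewCheck coordinator clock at the moment the sub-request was sent
     * @param localNow       receiving node's clock at receipt
     * @return the deadline expressed in the receiving node's own clock
     */
    static long adjust(long finishBefore, long clockSkewCheck, long localNow) {
        // Positive if the local clock reads ahead of the coordinator's send
        // timestamp; this also folds network transit time in, making it "free".
        long skew = localNow - clockSkewCheck;
        return finishBefore + skew;
    }

    public static void main(String[] args) {
        // finishBefore and clockSkewCheck from the example above; assume the
        // recipient's clock reads 27 ms past the coordinator's send timestamp.
        System.out.println(adjust(1758575860605L, 1758575850673L, 1758575850700L));
        // -> 1758575860632, i.e. the same instant in local-clock terms
    }
}
```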
> 
> Then we could also allow the user to send the "finishBefore" to the
> coordinator if the original requesting system also has a reliably
> synchronized clock.
> 
> On Mon, Sep 22, 2025 at 10:21 AM David Smiley <[email protected]> wrote:
> 
>> I think it definitely makes sense to discount timeAllowed for the shard
>> requests.  I was thinking about this and related matters recently at my
>> last job. I recall thinking that a possible solution is to use the NOW
>> passed along from coordinator to shard as the epoch to base the time
>> allowance on.  It's captured early by the coordinator. On the other hand,
>> it could be considered a hack / abuse of the intent of NOW.  Maybe a
>> separate parameter (or HTTP header) should be used to pass this along.
>> 
>> I don't think it's worth bothering doing this for memory or CPU limits.
>> But I'm not against it.
>> 
>> On Mon, Sep 22, 2025 at 10:11 AM Andrzej Białecki <[email protected]> wrote:
>> 
>>> Hi devs,
>>> 
>>> While working on SOLR-17869 two things occurred to me:
>>> 
>>> 1. Discounting the `timeAllowed`
>>> 
>>> When `timeAllowed` is > 0, `TopGroupsShardRequestFactory` discounts the
>>> time already spent at the query coordinator when creating shard requests.
>>> This happens ONLY for grouping queries; for all other queries
>>> `timeAllowed` is passed as-is to shard requests.
>>> 
>>> The net effect is that for other query types the same value of
>>> `timeAllowed` is applied to the processing at the coordinator as well as to
>>> the per-shard local processing. Since `timeAllowed` is a wall-clock time
>>> relative to the time of the local request, not discounting the time already
>>> spent increases the likelihood of the query timing out at the coordinator
>>> (because even if shard requests all fit within the `timeAllowed` *as
>>> started at each shard*, it still takes some time to return the results and
>>> post-process them at the coordinator, where timeAllowed was started much
>>> earlier).
>>> 
>>> From this POV it seems that `timeAllowed` should always be discounted
>>> before sending shard requests, for all types of queries, which is not the
>>> case today.
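
[AB: for reference, the uniform discounting I’m proposing boils down to
something like this sketch — class and method names are illustrative, though
the grouping path in TopGroupsShardRequestFactory already does the equivalent:]

```java
public class TimeAllowedDiscounter {
    /**
     * Returns the timeAllowed to pass along to shard requests: the original
     * budget minus the time already spent at the coordinator, floored at zero.
     */
    static long discount(long timeAllowedMs, long requestStartMs, long nowMs) {
        long elapsed = nowMs - requestStartMs; // time already burned at the coordinator
        return Math.max(0, timeAllowedMs - elapsed);
    }

    public static void main(String[] args) {
        // 10s budget, 68 ms spent at the coordinator before fan-out.
        System.out.println(discount(10000, 1758575850605L, 1758575850673L)); // 9932
    }
}
```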
>>> 
>>> Another question comes to mind: should we discount other types of query
>>> limits, too? For other limit types the same “budget” is applied both at
>>> the coordinator and locally per shard; however, CPU and memory limits are
>>> not tied to an absolute point in time, so maybe it doesn’t make sense to
>>> discount them by the amount spent at the coordinator, unless we want to
>>> limit the “end-to-end” impact of a query across both local and distrib
>>> sub-requests. WDYT?
>>> 
>>> 
>>> 2. QueryCommand.timeAllowed
>>> 
>>> During the upgrade to Lucene 10 a few blocks of code were commented out in
>>> SolrIndexSearcher, and one of them was this (SolrIndexSearcher:309):
>>> 
>>> /* final long timeAllowed = cmd.getTimeAllowed();
>>> if (timeAllowed > 0) {
>>>  setTimeout(new QueryTimeoutImpl(timeAllowed));
>>> }*/
>>> 
>>> I was surprised that none of the tests failed with this change and started
>>> digging. QueryCommand.timeAllowed is initialized from request params. But
>>> it looks like this property is no longer needed because now for each
>>> request we initialize the QueryLimits from the same request params, which
>>> serve the same purpose.
>>> 
>>> What’s more, I think the commented-out section was a bug, because it would
>>> call `IndexSearcher.setTimeout` on a searcher instance that was reused for
>>> multiple queries, so that setting would persist even for subsequent
>>> requests.
>>> 
>>> I propose to remove `QueryCommand.timeAllowed` altogether.
>>> 
>>> —
>>> 
>>> Andrzej Białecki
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>> 
>>> 
>> 
> 
> 
> -- 
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)

