Hi devs,

While working on SOLR-17869 two things occurred to me:

1. Discounting the `timeAllowed`

When `timeAllowed` is > 0 `TopGroupsShardRequestFactory` discounts the time 
already spent at the query coordinator when creating shard requests. This 
happens ONLY for the grouping queries, for all other queries `timeAllowed` is 
passed as-is to shard requests.

The net effect is that for other query types the same value of `timeAllowed` is 
applied to the processing at the coordinator as well as to the per-shard local 
processing. Since `timeAllowed` is a wall-clock time relative to the time of 
the local request, not discounting the time already spent increases the 
likelihood of the query timing out at the coordinator (because even if shard 
requests all fit within the `timeAllowed` *as started at each shard*, it still 
takes some time to return the results and post-process them at the coordinator 
where timeAllowed was started much earlier).

From this POV it seems that `timeAllowed` should always be discounted before 
sending shard requests, for all types of queries, which is not the case today.

Another question comes to mind, too: should we discount other types of query 
limits, too? For other limit types the same “budget” is applied both at the 
coordinator and locally per-shard - however, CPU and memory limits are not tied 
to an absolute point in time so maybe it doesn’t make sense to discount them by 
the amount spent at the coordinator, unless we want to limit the “end-to-end” 
impact of a query across both local and distrib sub-requests. WDYT?


2. QueryCommand.timeAllowed

During the upgrade to Lucene 10 a few blocks of code were commented out in 
SolrIndexSearcher, and one of them was this (SolrIndexSearcher:309):

/* final long timeAllowed = cmd.getTimeAllowed();
if (timeAllowed > 0) {
  setTimeout(new QueryTimeoutImpl(timeAllowed));
}*/

I was surprised that none of the tests failed with this change and started 
digging. QueryCommand.timeAllowed is initialized from request params. But it 
looks like this property is no longer needed because now for each request we 
initialize the QueryLimits from the same request params, which serve the same 
purpose.

What’s more, I think the commented-out section was a bug because it would call 
`IndexSearcher.setTimeout` on a searcher instance that was used for multiple 
queries, and that setting would persist even for other subsequent requests.

I propose to remove `QueryCommand.timeAllowed` altogether. 

—

Andrzej Białecki


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to