[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

Gus Heck (Jira) Sat, 19 Oct 2024 06:24:18 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891152#comment-17891152
 ]


Gus Heck commented on SOLR-17158:
---------------------------------

I think the test you suggest defeats the code's efforts by putting all shard 
queries into a 5-second sleep. All the Limit code relies on the thread 
executing the query to check itself. There's no supervisor thread, so if the 
thread is forced into a sleep, it can't check itself until it wakes up. It's 
not a very realistic test because there will typically be variation among 
shards, and in conjunction with the searcher-level checks it takes a pretty 
interesting scenario for all shards to get stuck simultaneously within the 
Lucene searcher code, or in a loop that has no check. This ticket mostly ensures 
that once one thread times out, the effort/time spent waiting on the remaining 
threads is minimized. One could imagine designs for timeAllowed where the 
coordinator is completely divorced from the shard processing and has a 
CountDownLatch that waits with a timeout and simply returns whatever happens to 
be available when that expires, but that's not how things currently work, and 
relative to CPU and memory limits that's a special case.
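
To make the alternative design mentioned above concrete, here is a minimal 
sketch of a coordinator that waits on a CountDownLatch with a timeout and 
returns whatever has arrived. It is not how Solr works today, and every name in 
it (LatchCoordinatorSketch, Merged, fakeShard, query) is hypothetical; the 
"shards" are plain sleeping tasks standing in for real shard requests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch only; none of these names exist in Solr.
class LatchCoordinatorSketch {

  /** Whatever was collected by the deadline, plus a flag saying more may exist. */
  static final class Merged {
    final List<String> results;
    final boolean partial;

    Merged(List<String> results, boolean partial) {
      this.results = results;
      this.partial = partial;
    }
  }

  /** Stand-in for a shard request that takes delayMs to answer. */
  static Supplier<String> fakeShard(String name, long delayMs) {
    return () -> {
      try {
        Thread.sleep(delayMs);
        return name;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return null; // abandoned by the coordinator, nothing to report
      }
    };
  }

  static Merged query(List<Supplier<String>> shards, long timeAllowedMs) {
    if (shards.isEmpty()) {
      return new Merged(Collections.<String>emptyList(), false);
    }
    ExecutorService pool = Executors.newFixedThreadPool(shards.size());
    Queue<String> collected = new ConcurrentLinkedQueue<>();
    CountDownLatch done = new CountDownLatch(shards.size());
    for (Supplier<String> shard : shards) {
      pool.execute(() -> {
        try {
          String result = shard.get();
          if (result != null) {
            collected.add(result);
          }
        } finally {
          done.countDown(); // count down even if the shard produced nothing
        }
      });
    }
    boolean allArrived;
    try {
      // The coordinator's own clock enforces the limit, so a shard stuck in a
      // sleep cannot delay the response much past timeAllowedMs.
      allArrived = done.await(timeAllowedMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      allArrived = false;
    }
    List<String> snapshot = new ArrayList<>(collected);
    pool.shutdownNow(); // interrupt stragglers so they stop consuming resources
    return new Merged(snapshot, !allArrived);
  }

  public static void main(String[] args) {
    // One fast shard and one stuck in a long sleep, with a 1 second limit.
    Merged m = query(Arrays.asList(fakeShard("shard1", 10), fakeShard("shard2", 10000)), 1000);
    System.out.println(m.results + " partial=" + m.partial);
  }
}
```

Note that this only works for a wall-clock limit: the coordinator can watch its 
own clock, but it has no comparable view of the CPU or memory a shard has 
consumed, which is why such a design would be a special case.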

As for exceptions, I've tried to avoid special casing like you suggest and to 
stick to a simple, clear rule: one should only get an exception if the server 
*refused (4xx)* or *failed (5xx)* the task you gave it. When you add a limit you 
give it a task that would be phrased as: "Spend X effort searching this query, 
and if it takes more effort than that, (do|don't) tell me the documents you 
managed to find." In the case you describe, where the request limit is 
unreasonably short and nothing is found, the server has certainly not refused, 
nor has it failed. The server will set the partial results attribute to 
indicate that there might be more results. If there's any issue, it might be 
the naming of "partial results", since that name sort of sounds like there must 
be some results. Perhaps a more communicative response would say
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsIncomplete" {code}
or 
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"incompleteResultsOmitted"{code}
or
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsComplete" {code}
or 
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"resultsComplete"{code}
However, that would be a much more backwards-incompatible change.
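
As a rough illustration of how those four combinations would be derived, here 
is a small sketch. The field names and values are the ones suggested above and 
do not exist in Solr; the class, enum, and method names are made up for the 
example.

```java
// Hypothetical sketch: the field values come from the suggestion above, not from Solr.
class ResultStatusSketch {

  enum ResultMode {
    BEST_EFFORT("bestEffort"),
    ALL_OR_NOTHING("allOrNothing");

    final String wireName;

    ResultMode(String wireName) {
      this.wireName = wireName;
    }
  }

  /** The status depends only on the mode and on whether a limit was reached. */
  static String resultStatus(ResultMode mode, boolean limitReached) {
    if (!limitReached) {
      return "resultsComplete";
    }
    return mode == ResultMode.BEST_EFFORT ? "resultsIncomplete" : "incompleteResultsOmitted";
  }

  /** Renders the two fields in the same shape as the snippets above. */
  static String render(ResultMode mode, boolean limitReached) {
    return "\"resultMode\":\"" + mode.wireName + "\",\n"
        + "\"resultStatus\":\"" + resultStatus(mode, limitReached) + "\"";
  }

  public static void main(String[] args) {
    for (ResultMode mode : ResultMode.values()) {
      for (boolean limitReached : new boolean[] {true, false}) {
        System.out.println(render(mode, limitReached));
        System.out.println();
      }
    }
  }
}
```

The number of documents found never enters into it: a best-effort request that 
hits its limit with zero documents is still just "resultsIncomplete", not an 
error.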

I have thought it would be much nicer if there were an HTTP status code that 
could be set to indicate partial results without having to parse the response. 
The closest is 
[206|https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range], but 
that's really meant for [Range 
Requests|https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range], and 
since we neither use the associated headers nor have a way to express, or even 
satisfy, "give me the second 30 seconds of results", it's not a good fit. We 
could just co-opt our own 2xx number and document it, perhaps even propose it, 
since results limited by processing constraints could probably be considered a 
generic situation (maybe {{225 LIMITED RESULTS}}?), but that may cause 
problems for current users and is backwards incompatible for folks who check 
for {{200 OK}} specifically.

> Terminate distributed processing quickly when query limit is reached
> --------------------------------------------------------------------
>
>                 Key: SOLR-17158
>                 URL: https://issues.apache.org/jira/browse/SOLR-17158
>             Project: Solr
>          Issue Type: Sub-task
>          Components: Query Limits
>            Reporter: Andrzej Bialecki
>            Assignee: Gus Heck
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: main (10.0), 9.8
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Solr should make sure that when query limits are reached and partial results 
> are not needed (and not wanted) then both the processing in shards and in the 
> query coordinator should be terminated as quickly as possible, and Solr 
> should minimize wasted resources spent on eg. returning data from the 
> remaining shards, merging responses in the coordinator, or returning any data 
> back to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
