[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2025-06-20 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985044#comment-17985044
 ] 

Houston Putman commented on SOLR-17158:
---

Also, as [~dsmiley] mentioned, there is significant overlap between 
{{shards.tolerant}} and {{allowPartialResults}}. I prefer the naming of 
the second one, and ultimately think it should be used instead at some 
point in the future, but there should be coherent (unified) logic to determine 
whether we should cancel outstanding requests and stop waiting for new ones. 
Currently those two actions are governed separately by the two options 
mentioned above, which doesn't make sense.
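
To make that concrete, here is a hypothetical sketch (all names invented, not 
Solr's actual API) of what a single unified decision point might look like, 
where both "cancel outstanding requests" and "stop waiting" are derived from 
one policy rather than two unrelated parameters:

```java
// Hypothetical sketch: one policy object answering both questions that are
// currently governed separately by shards.tolerant and allowPartialResults.
public class PartialResultsPolicy {
  private final boolean allowPartialResults; // would come from the request params

  public PartialResultsPolicy(boolean allowPartialResults) {
    this.allowPartialResults = allowPartialResults;
  }

  // Should the coordinator cancel requests still in flight once a limit trips?
  public boolean cancelOutstandingRequests(boolean limitReached) {
    return limitReached && !allowPartialResults;
  }

  // Should the coordinator stop waiting for responses not yet received?
  public boolean stopWaitingForResponses(boolean limitReached) {
    // Unified: the same condition drives both actions, so they cannot disagree.
    return cancelOutstandingRequests(limitReached);
  }
}
```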

> Terminate distributed processing quickly when query limit is reached
> 
>
> Key: SOLR-17158
> URL: https://issues.apache.org/jira/browse/SOLR-17158
> Project: Solr
>  Issue Type: Sub-task
>  Components: Query Limits
>Reporter: Andrzej Bialecki
>Assignee: Gus Heck
>Priority: Major
>  Labels: pull-request-available
> Fix For: main (10.0), 9.8
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Solr should make sure that when query limits are reached and partial results 
> are not needed (and not wanted) then both the processing in shards and in the 
> query coordinator should be terminated as quickly as possible, and Solr 
> should minimize wasted resources spent on eg. returning data from the 
> remaining shards, merging responses in the coordinator, or returning any data 
> back to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2025-06-20 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985000#comment-17985000
 ] 

Houston Putman commented on SOLR-17158:
---

Hey all, I think we need to rethink the synchronization here, since this change 
has caused massive performance issues when using the ParallelHttpShardHandler.




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-27 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17893283#comment-17893283
 ] 

Gus Heck commented on SOLR-17158:
-

Unfortunately I hadn't been thinking carefully about partial results at that 
time. As you can see, I don't agree with that direction. I notice that at the 
start of this you mention the concept of a configurable limit. Perhaps adding 
such a parameter is the best compromise: {{&minShardResponses=1}} could be the 
default for 9.x and default to 0 (no error) in 10? It would be defined to have 
no meaning unless a partial response is returned.
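
A minimal sketch of how such a check might behave; the {{minShardResponses}} 
parameter is only a proposal above, so all names here are hypothetical:

```java
// Hypothetical sketch of the proposed minShardResponses check. Per the
// proposal, it has no meaning unless the response is partial.
public class MinShardResponsesCheck {
  // Returns true if the request should be failed with an error.
  public static boolean shouldError(
      boolean partialResponse, int shardsResponded, int minShardResponses) {
    if (!partialResponse) {
      return false; // parameter has no meaning for complete responses
    }
    return shardsResponded < minShardResponses;
  }
}
```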




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-20 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891263#comment-17891263
 ] 

Gus Heck commented on SOLR-17158:
-

Also, adding metadata about what fraction of shards completed seems like a 
reasonable follow-on feature, with full info about which shards in the debug 
case... but one of the things I think is difficult about failover-type behavior 
here is that there are several types of failure:
 # The limit was just too small; even a healthy server can't answer in the 
allotted time/space (a 4xx type of case if an error is to be thrown)
 # The query is unreasonable, and even a healthy server can't answer it in the 
allotted time/space (a 4xx type of case if an error is to be thrown)
 # The query and limit are reasonable, but... (these are 5xx-like cases if an 
error is to be thrown)
 ## The cluster is under extreme load, and thus all shards are going to be 
unable to answer
 ## This individual node is under extreme load, and an alternative node might 
answer.

In every case except 3.2 repeating the request is harmful. The code already 
detects and retries if the http communication fails, but adding this 
timeAllowed parameter means that we can effectively hide from that retry code. 
If 3.1 is the usual problem that would be a good thing, and if 3.2 is most 
common that's not so good. In the case where zero shards responded 3.1 seems 
much more likely. So after pondering all this for a long time, I've come to the 
thought that throwing exceptions or otherwise using the response to gauge 
server health is a poor substitute for system monitoring. So certainly the 
metadata you suggest might be nice to see for troubleshooting, but I'm leery of 
the notion that it might be used for automated fail-over / fall-back.

Also as you can see if we start throwing errors, we have no way to decide what 
error to throw... 4xx says "user, you must change your request" and 5xx says 
"come back later with that request, we've got problems"... So this is another 
part of why I settled on 200 OK, YOU ASKED FOR IT ;)




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-20 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891261#comment-17891261
 ] 

Gus Heck commented on SOLR-17158:
-

One of the big difficulties is that the rest of the code can't handle the case 
where no shard response has been received; there seem to be multiple places 
where that hasn't been considered. One prominent example: mergeIds never gets 
called and {{_responseDocs}} is never initialized in ResponseBuilder, so later 
several bits of code call getResponseDocs without a null check and blow up. I 
think the first one that gets hit is 
{{org.apache.solr.handler.component.QueryComponent#regularFinishStage}}, where 
it tries to get an iterator. I'm going to guess this, or something like it, is 
the NPE you found when cherry-picking 
[https://github.com/apache/solr/pull/2493]. I tried adding an initialization of 
that field, but then I soon hit something else uninitialized, and I became 
wary of trying to also reform something as central and widely used as 
ResponseBuilder at the same time. My conclusion was that, in the code as it 
exists, it is critical to ensure that either at least one shard response is 
received or srsp.setException is called. There is a test that intentionally 
shuts down all shards that hits this kind of case unless 
recordNoUrlShardResponse() is called. (At one point I tried to eliminate the 
exception creation there, but then found out it existed to later be thrown and 
thus prevent finishStage() from being called on components when all shards 
fail...)




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-19 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891187#comment-17891187
 ] 

David Smiley commented on SOLR-17158:
-

I sympathize with the 200 response; it's reasonable, though also debatable. If 
we could supply a bit of metadata in the header for the caller to know what 
percentage of sub-shards timed out, the caller might want to act on this (i.e. 
do fallback / failover actions).

I take no issue with the sub-shards using the full 5 seconds for this 
unrealistic sleep(5000,0) call. But the coordinator need not wait on the 
take() indefinitely; it should wait no more than the 10 milliseconds given. I 
think it's unfortunate if we must wait for one sub-shard request to respond; it 
feels needless to me; a deficiency. I suppose you might say this shouldn't even 
happen because the sub-shard *should* not take more than 10ms itself. I wish it 
didn't, but where I work we see that it can happen (via SOLR-17320). Hmmm; I'm 
not sure what action to take.
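
The difference can be sketched with plain java.util.concurrent (an illustration 
of timed vs. indefinite waiting, not Solr's actual shard-handler code): 
{{CompletionService.take()}} blocks until some task finishes, while 
{{poll(timeout)}} bounds the wait.

```java
import java.util.concurrent.*;

public class TimedWaitDemo {
  // Waits at most waitMillis for the next completed task; returns null on
  // timeout. (cs.take() would instead block indefinitely.)
  static Integer pollWithBudget(CompletionService<Integer> cs, long waitMillis) {
    try {
      Future<Integer> f = cs.poll(waitMillis, TimeUnit.MILLISECONDS);
      return (f == null) ? null : f.get();
    } catch (InterruptedException | ExecutionException e) {
      return null;
    }
  }

  // Simulates one slow "shard" and a 10 ms coordinator budget; returns the
  // milliseconds the coordinator actually spent waiting.
  public static long simulateSlowShard() {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
    cs.submit(() -> { Thread.sleep(5000); return 42; }); // slow shard request
    long start = System.nanoTime();
    Integer result = pollWithBudget(cs, 10); // 10 ms budget, like timeAllowed=10
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    pool.shutdownNow();
    System.out.println("result=" + result + " waited " + elapsedMs + " ms");
    return elapsedMs;
  }

  public static void main(String[] args) {
    simulateSlowShard();
  }
}
```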




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-19 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891152#comment-17891152
 ] 

Gus Heck commented on SOLR-17158:
-

I think the test you suggest defeats the code's efforts by putting all shard 
queries into a 5 second sleep. All the limit code relies on the thread 
executing the query to check itself; there's no supervisor thread, so if the 
thread is forced into a sleep, it can't check itself until it wakes up. It's 
not a very realistic test, because typically there will be variation among 
shards, and in conjunction with the searcher-level checks it takes a pretty 
interesting scenario for all shards to get stuck simultaneously within the 
Lucene searcher code, or in a loop that has no check. This ticket mostly 
ensures that once one thread times out, the effort/time spent waiting on the 
remaining threads is minimized. One could imagine designs for timeAllowed where 
the coordinator is completely divorced from the shard processing and has a 
CountDownLatch that waits with a timeout and simply returns whatever happens to 
be available when that expires, but that's not how things currently work, and 
relative to CPU and memory limits that's a special case.
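
The cooperative model described above can be sketched in plain Java 
(illustrative only, not Solr's actual QueryLimits code): the worker checks a 
deadline itself between units of work, so if it is parked in a sleep or a loop 
with no check, nothing stops it until it next reaches a check.

```java
public class CooperativeLimitDemo {
  // A self-checking worker: does work in small steps and tests its own deadline
  // between steps, the way limit checks are woven into query execution.
  // Returns the number of steps completed before the limit tripped.
  public static int runWithDeadline(long budgetMillis, int totalSteps, long stepCostMillis) {
    long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
    int done = 0;
    for (int i = 0; i < totalSteps; i++) {
      if (System.nanoTime() > deadline) {
        break; // the thread notices the limit itself; no supervisor is involved
      }
      busyWork(stepCostMillis); // if this step never returned, the check above would never run
      done++;
    }
    return done;
  }

  private static void busyWork(long millis) {
    long end = System.nanoTime() + millis * 1_000_000L;
    while (System.nanoTime() < end) { /* spin to simulate query work */ }
  }
}
```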

As for exceptions, I've tried to avoid special-casing like you suggest, and 
stick to a simple, clear rule: one should only get an exception if the server 
*refused (4xx)* or *failed (5xx)* the task you gave it. When you add a limit, 
you give it a task that would be phrased as: "Spend X effort searching this 
query, and if it takes more effort than that, (do|don't) tell me the documents 
you managed to find." In the case you describe, where the request limit is 
unreasonably short and nothing is found, the server has certainly not refused, 
nor has it failed. The server will set the partial results attribute to 
indicate that there might be more results. If there's any issue, it might be 
the naming of "partial results", since that name sort of sounds like there must 
be some results. Perhaps a more communicative response would say
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsIncomplete" {code}
or 
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"incompleteResultsOmitted"{code}
 

or
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsComplete" {code}
or 
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"resultsComplete"{code}
However that's a much more backwards incompatible change.

I have thought it would be much nicer if there were an HTTP status code that 
could be set to indicate partial results without having to parse the response, 
and the closest is 
[206|https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range], but 
that's really meant for [Range 
Requests|https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range] and 
since we neither use the associated headers nor have a way to express, or even 
satisfy "give me the second 30 seconds of results" it's not a good fit. We 
could just co-opt our own 2xx number and document it, perhaps even propose it 
(since results limited by processing constraints probably could be considered a 
generic situation... maybe {{225 LIMITED RESULTS}} ?), but that may cause 
problems for current users and is backwards incompatible for folks who check 
for {{200 OK}} specifically.




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-18 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891053#comment-17891053
 ] 

David Smiley commented on SOLR-17158:
-

Question:  if all shards time out and if shards.tolerant=true, then we don't 
truly have partial results to return to the user.  What should we do?  I argue 
failing is appropriate instead of 200 despite shards.tolerant.

 I created the following test method on TestDistributedSearch (where 
shards.tolerant is tested):
{code:java}
  public void testTimeAllowed() throws Exception {
    var cluster =
        new MiniSolrCloudCluster.Builder(1, createTempDir())
            .addConfig("conf", configset("cloud-minimal"))
            .configure();
    CloudSolrClient client = cluster.getSolrClient();
    String COLLECTION_NAME = "test_coll";
    CollectionAdminRequest.createCollection(COLLECTION_NAME, "conf", 2, 1)
        .setRouterName("implicit")
        .setShards("shard_1,shard_2")
        .process(cluster.getSolrClient());
    cluster.waitForActiveCollection(COLLECTION_NAME, 2, 2);

    var query = new SolrQuery();
    query.setQuery("{!func}sleep(5000,0)");
    query.set(CommonParams.TIME_ALLOWED, 10);
    query.set(ShardParams.SHARDS_TOLERANT, "true");
    var response = client.query(COLLECTION_NAME, query);
    System.out.println(response);
    assertTrue(response.getElapsedTime() < 1000); // definitely less than a second
    cluster.shutdown();
  } {code}
And it fails, not actually timing out within the 10 milliseconds given for 
timeAllowed. I expected the work in this issue would make this pass?




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890108#comment-17890108
 ] 

ASF subversion and git services commented on SOLR-17158:


Commit 6dbd8d1b8e0e1690b048e32bf152837b5a341fee in solr's branch 
refs/heads/branch_9x from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=6dbd8d1b8e0 ]

SOLR-17158 Terminate distributed processing faster when query limit is reached 
and partial results are not needed (#2666) (#2773)

(cherry picked from commit b748207bc331b5eeae284ee7602626dbe5e3ff50)




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-10-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886290#comment-17886290
 ] 

ASF subversion and git services commented on SOLR-17158:


Commit b748207bc331b5eeae284ee7602626dbe5e3ff50 in solr's branch 
refs/heads/main from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b748207bc33 ]

SOLR-17158 Terminate distributed processing faster when query limit is reached 
and partial results are not needed (#2666)






[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-04-04 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833972#comment-17833972
 ] 

Gus Heck commented on SOLR-17158:
-

Even prior to this change, success/fail is a more limited concept than the 
actual behavior... Requests may:

Fail - due to syntax, etc. (4xx)

Fail - due to server issues (5xx)

Fail - due to upstream node server issues (5xx) (shards.tolerant not set)

Succeed, returning all results (200)

Succeed, returning partial results due to any of the following cases where Solr 
is doing what we told it to:
 * Query limit (timeAllowed, cpuAllowed) (200)
 * Upstream node server returns 5xx and shards.tolerant=true (200)
 * Query cancellation (200)

All of those rely on the partialResults attribute, and the code has logic that 
relies on the attribute after we no longer know why it was set, so this 
parameter will have the effect of preventing partial results for all of these 
cases. I have a specific check to return an error (400, bad request) if this 
attribute is set at the same time as shards.tolerant, because those two notions 
clearly conflict.
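
That validation can be sketched as follows; the class and method names here are 
hypothetical, invented for illustration (the real check lives in Solr's request 
parsing):

```java
// Hypothetical sketch of rejecting conflicting parameters with a 400 response.
public class ParamConflictCheck {
  // Returns an error message for a 400 response, or null if the combination
  // is valid. partialResults is null when the parameter was not supplied.
  public static String validate(Boolean partialResults, boolean shardsTolerant) {
    // partialResults=false says "discard partial results"; shards.tolerant=true
    // says "keep going and return what you can" -- the two notions conflict.
    if (partialResults != null && !partialResults && shardsTolerant) {
      return "partialResults=false may not be combined with shards.tolerant=true";
    }
    return null;
  }
}
```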

 

 




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-04-04 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833944#comment-17833944
 ] 

Andrzej Bialecki commented on SOLR-17158:
-

[~dsmiley] these are not exactly equivalent - when a limit is reached it 
doesn't have to be related in any way to per-shard processing.




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-04-03 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833722#comment-17833722
 ] 

David Smiley commented on SOLR-17158:
-

Fail vs Success should be based on {{shards.tolerant}} -- 
https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-tolerant-parameter




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-02-26 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820781#comment-17820781
 ] 

Andrzej Bialecki commented on SOLR-17158:
-

I'm not convinced we need a sysprop here... why shouldn't we use request 
handler's {{defaults}} and {{invariants}} sections in {{solrconfig.xml}} ? 
Using a sysprop effectively enforces the same default behavior for all replicas 
of all collections managed by this Solr node.




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-02-26 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820719#comment-17820719
 ] 

Gus Heck commented on SOLR-17158:
-

{quote}FYI, it was necessary to add this parameter in SOLR-17172, I used 
{{partialResults=true}} to mean that we should stop processing and return 
partial results with "success" code and "partialResults" flag in the response, 
and {{partialResults=false}} to mean that we should throw an exception and 
discard any partial results.
{quote}
I agree that the request should have a parameter to control this behavior, as 
well as providing a sysprop to determine the default. I am setting it up such 
that the parameter takes precedence over the sysprop, which only determines 
default behavior if the parameter is not supplied. May be a complicated merge. 
I've got things like this:
{code:java}
if (thereArePartialResults && !rb.req.shouldDiscardPartials()) { {code}
That method will need to be used everywhere, because it encapsulates the logic 
for both the sysprop and the params.
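
The precedence rule (request parameter wins; the sysprop only supplies the 
default) might look roughly like this sketch; the method name is taken from the 
comment above, everything else is illustrative:

```java
// Illustrative sketch of "parameter takes precedence over the sysprop".
public class DiscardPartialsDecision {
  // paramValue: the request's partialResults parameter, or null if not supplied.
  // syspropDefault: the node-wide default taken from a system property.
  public static boolean shouldDiscardPartials(Boolean paramValue, boolean syspropDefault) {
    if (paramValue != null) {
      return !paramValue; // explicit partialResults=false means discard partials
    }
    return syspropDefault; // otherwise the sysprop supplies the default behavior
  }
}
```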

> Terminate distributed processing quickly when query limit is reached
> 
>
> Key: SOLR-17158
> URL: https://issues.apache.org/jira/browse/SOLR-17158
> Project: Solr
>  Issue Type: Sub-task
>  Components: Query Limits
>Reporter: Andrzej Bialecki
>Assignee: Gus Heck
>Priority: Major
>
> Solr should make sure that when query limits are reached and partial results 
> are not needed (and not wanted) then both the processing in shards and in the 
> query coordinator should be terminated as quickly as possible, and Solr 
> should minimize wasted resources spent on eg. returning data from the 
> remaining shards, merging responses in the coordinator, or returning any data 
> back to the user.






[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-02-23 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820030#comment-17820030
 ] 

Andrzej Bialecki commented on SOLR-17158:
-

FYI, it was necessary to add this parameter in SOLR-17172, I used 
{{partialResults=true}} to mean that we should stop processing and return 
partial results with "success" code and "partialResults" flag in the response, 
and {{partialResults=false}} to mean that we should throw an exception and 
discard any partial results.




[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached

2024-02-21 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819319#comment-17819319
 ] 

Andrzej Bialecki commented on SOLR-17158:
-

Adding some observations from reading the code in {{SolrIndexSearcher}} and 
{{HttpShardHandler}}.

It appears that currently when {{timeAllowed}} is reached it doesn’t cause 
termination of all other pending shard requests. I found this section in 
{{SolrIndexSearcher:284}}:

{code:java}
try {
  super.search(query, collector);
} catch (TimeLimitingCollector.TimeExceededException
    | ExitableDirectoryReader.ExitingReaderException
    | CancellableCollector.QueryCancelledException x) {
  log.warn("Query: [{}]; ", query, x);
  qr.setPartialResults(true);
{code}

When it reaches the {{timeAllowed}} limit (and our new {{QueryLimits}}, too) it 
simply sets {{partialResults=true}} and does NOT throw any exception, so all 
the layers above think that the result is a success.

I suspect the reason for this was that when {{timeAllowed}} was set we still 
wanted to retrieve partial results when the limit was hit, and throwing an 
exception here would prevent that.
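
To make the contrast concrete, here is a toy model (hypothetical names, not the 
actual {{SolrIndexSearcher}} code): swallowing the limit exception produces a 
"successful" response flagged as partial, while rethrowing it surfaces an error 
the upper layers can act on.

```java
// Toy model of the two behaviors described above; names are illustrative only.
public class LimitHandling {
  static class TimeExceededException extends RuntimeException {}

  /**
   * Runs the search work. Returns true if a limit was hit but tolerated
   * (caller sees "success" plus a partial-results flag); throws if partial
   * results should be discarded, so callers treat the request as failed.
   */
  static boolean search(Runnable work, boolean discardPartials) {
    try {
      work.run();
      return false; // completed within limits: full results
    } catch (TimeExceededException x) {
      if (discardPartials) {
        throw x; // propagate: upper layers see an error, not a success
      }
      return true; // swallow: the equivalent of qr.setPartialResults(true)
    }
  }

  public static void main(String[] args) {
    // Limit hit, partials tolerated: reported as partial success.
    System.out.println(search(() -> { throw new TimeExceededException(); }, false));
  }
}
```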

OTOH, if we had a request param saying “discard everything when you reach a 
limit and cancel any ongoing requests” then we could throw an exception here, 
and {{ShardHandler}} would recognize this as an error and cancel all other 
shard requests that are still pending, so that replicas could avoid sending 
back their results that would be discarded anyway.
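
On the coordinator side, that cancellation idea might look roughly like this — 
a sketch using plain {{Future}}s, not Solr's actual {{ShardHandler}} API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Sketch only: when one shard response surfaces an error (e.g. a query limit
// was exceeded and partials are unwanted), cancel the still-pending shard
// requests instead of waiting for results that would be discarded anyway.
public class CancelOnLimit {

  static List<String> collect(List<? extends Future<String>> pending) throws Exception {
    List<String> results = new ArrayList<>();
    for (int i = 0; i < pending.size(); i++) {
      try {
        results.add(pending.get(i).get());
      } catch (ExecutionException limitHit) {
        for (int j = i + 1; j < pending.size(); j++) {
          pending.get(j).cancel(true); // stop waiting on remaining shards
        }
        throw limitHit; // report the failure instead of a partial "success"
      }
    }
    return results;
  }

  public static void main(String[] args) throws Exception {
    // All shards succeed: results are merged normally.
    System.out.println(collect(List.of(CompletableFuture.completedFuture("shard1"))));
  }
}
```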
