[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985044#comment-17985044 ]
Houston Putman commented on SOLR-17158:
---
Also, as [~dsmiley] mentioned, there is significant overlap between
{{shards.tolerant}} and {{allowPartialResults}}. I prefer the naming of the
second one, and ultimately think it should probably be used instead at some
point in the future, but there should be coherent (unified) logic to determine
whether we should cancel outstanding requests and stop waiting for new ones.
Currently those two actions are governed by the two separate options mentioned
above, which doesn't make sense.
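One way to picture the "unified logic" suggested above is a single policy object consulted both when deciding whether to cancel in-flight shard requests and whether to stop waiting for stragglers. This is a hypothetical sketch; the class and method names are illustrative, not Solr's actual identifiers.

```java
// Hypothetical sketch: one place to combine the two overlapping options
// (shards.tolerant and allowPartialResults) into a single decision about
// cancelling outstanding shard requests when a limit trips.
public final class PartialResultsPolicy {
  private final boolean shardsTolerant;      // assumed to mirror shards.tolerant
  private final boolean allowPartialResults; // assumed to mirror allowPartialResults

  public PartialResultsPolicy(boolean shardsTolerant, boolean allowPartialResults) {
    this.shardsTolerant = shardsTolerant;
    this.allowPartialResults = allowPartialResults;
  }

  /** Partial results are acceptable only if both knobs agree we may return them. */
  public boolean mayReturnPartial() {
    return shardsTolerant && allowPartialResults;
  }

  /** If partials are unacceptable, cancel outstanding shard requests as soon as
   *  any limit trips, and stop waiting for further responses. */
  public boolean cancelOutstandingOnLimit() {
    return !mayReturnPartial();
  }
}
```

The point is only that both actions (cancelling and stop-waiting) derive from one predicate, rather than each being governed by a different parameter.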
> Terminate distributed processing quickly when query limit is reached
>
>
> Key: SOLR-17158
> URL: https://issues.apache.org/jira/browse/SOLR-17158
> Project: Solr
> Issue Type: Sub-task
> Components: Query Limits
>Reporter: Andrzej Bialecki
>Assignee: Gus Heck
>Priority: Major
> Labels: pull-request-available
> Fix For: main (10.0), 9.8
>
> Time Spent: 8h 50m
> Remaining Estimate: 0h
>
> Solr should make sure that when query limits are reached and partial results
> are not needed (and not wanted) then both the processing in shards and in the
> query coordinator should be terminated as quickly as possible, and Solr
> should minimize wasted resources spent on eg. returning data from the
> remaining shards, merging responses in the coordinator, or returning any data
> back to the user.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985000#comment-17985000 ]
Houston Putman commented on SOLR-17158:
---
Hey all, I think we need to rethink the synchronization here, since this change
has caused massive performance issues when using the ParallelHttpShardHandler.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17893283#comment-17893283 ]
Gus Heck commented on SOLR-17158:
-
I hadn't been thinking carefully about, or paying attention to, partial results
at that time, unfortunately. As you can see, I don't agree with that direction.
I notice that at the start of this you mention the concept of a configurable
limit. Perhaps adding such a parameter is the best compromise.
{{&minShardResponses=1}} could be the default for 9.x and default to 0 (no
error) in 10? It would be defined to have no meaning unless a partial response
is returned.
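The proposed semantics can be sketched in a few lines. This is illustrative only: the parameter name and defaults come from the comment above, not from any shipped code.

```java
// Hypothetical sketch of the proposed minShardResponses semantics: the check
// has no meaning for complete responses; for partial responses, fewer than the
// minimum number of responding shards turns the response into an error.
public final class MinShardResponses {
  /**
   * @param partial           whether the response is marked partialResults
   * @param shardsAnswered    number of shards that contributed results
   * @param minShardResponses 0 disables the check (proposed 10.x default);
   *                          1 is the proposed 9.x default
   */
  public static boolean shouldError(boolean partial, int shardsAnswered, int minShardResponses) {
    if (!partial) {
      return false; // defined to have no meaning unless a partial response is returned
    }
    return shardsAnswered < minShardResponses;
  }
}
```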
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891263#comment-17891263 ]
Gus Heck commented on SOLR-17158:
-
Also, adding metadata about what fraction of shards completed seems like a
reasonable follow-on feature, and full info about which shards in the debug
case... but one of the things I think is difficult about failover-type
behavior here is that there are several types of failures:
# The limit was just too small; even a healthy server can't answer in the
allotted time/space (this is a 4xx type of case if an error is to be thrown)
# The query is unreasonable, and even a healthy server can't answer it in the
allotted time/space (this is a 4xx type of case if an error is to be thrown)
# The query and limit are reasonable, but... (these are 5xx-like cases if an
error is to be thrown)
## The cluster is under extreme load, and thus all shards are going to be
unable to answer
## This individual node is under extreme load, and an alternative node might
answer.
In every case except 3.2, repeating the request is harmful. The code already
detects and retries if the http communication fails, but adding this
timeAllowed parameter means that we can effectively hide from that retry code.
If 3.1 is the usual problem, that would be a good thing; if 3.2 is most
common, that's not so good. In the case where zero shards responded, 3.1 seems
much more likely.
So after pondering all this for a long time, I've come to the thought that
throwing exceptions or otherwise using the response to gauge server health is
a poor substitute for system monitoring. So certainly the metadata you suggest
might be nice to see for troubleshooting, but I'm leery of the notion that it
might be used for automated fail-over / fall-back. Also, as you can see, if we
start throwing errors we have no way to decide which error to throw... 4xx
says "user, you must change your request" and 5xx says "come back later with
that request, we've got problems"... So this is another part of why I settled
on 200 OK, YOU ASKED FOR IT ;)
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891261#comment-17891261 ]
Gus Heck commented on SOLR-17158:
-
One of the big difficulties is that the rest of the code can't handle the case
where no shard response has been received. There seem to be multiple places
where that hasn't been considered. One prominent example: because mergeIds
then never gets called, {{_responseDocs}} is never initialized in
ResponseBuilder, and later several bits of code try to use getResponseDocs
without a null check and blow up... I think the first one that gets hit is
{{org.apache.solr.handler.component.QueryComponent#regularFinishStage}}, where
it tries to get an iterator... I'm going to guess this, or something like it,
is the NPE you found when cherry-picking
[https://github.com/apache/solr/pull/2493]. I tried adding an initialization
of that field, but then I soon hit something else uninitialized, and I became
scared of trying to also reform something as central and widely used as
ResponseBuilder at the same time. My conclusion was that, in the code as it
exists, it is critical to ensure that either at least one shard response is
received or srsp.setException is called. There is a test that intentionally
shuts down all shards that hits this kind of case unless
recordNoUrlShardResponse() is called. (At one point I tried to eliminate the
exception creation there, but then found out it existed to be thrown later and
thus prevent finishStage() from being called on components when all shards
fail...)
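The failure mode described above can be boiled down to a tiny model. None of these classes are Solr's; they only illustrate "a field that is set by the merge step NPEs a later stage when no shard ever responded", and the obvious defensive alternative.

```java
// Minimal illustration (not Solr's classes): a "response docs" field that is
// only initialized when at least one shard responds, consumed later by a
// finish stage that assumes merging happened.
import java.util.List;

public final class ResponseBuilderSketch {
  List<String> responseDocs; // set only by the mergeIds analogue below

  void mergeIds(List<String> shardDocs) {
    this.responseDocs = shardDocs;
  }

  /** Analogue of a finish-stage consumer with no null check. */
  int finishStageUnsafe() {
    int n = 0;
    for (String doc : responseDocs) { // NPE when no shard ever responded
      n++;
    }
    return n;
  }

  /** A defensive variant: treat "no shard responded" as an explicit case. */
  int finishStageGuarded() {
    return responseDocs == null ? 0 : responseDocs.size();
  }
}
```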
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891187#comment-17891187 ]
David Smiley commented on SOLR-17158:
-
I sympathize with the 200 response; it's reasonable, though also debatable. If
we could supply a bit of metadata in the header letting the caller know what
percentage of sub-shards timed out, the caller might want to act on it (i.e.
do fallback / failure-scenario actions).
I take no issue with the sub-shards using the full 5 seconds for this
unrealistic sleep(5000,0) call. But the coordinator need not wait on the
take() indefinitely; it should wait no more than the 10 milliseconds given. I
think it's sad if we must wait for one sub-shard request to respond; it feels
needless to me; a deficiency. I suppose you might say this shouldn't even
happen because the sub-shard *should* not take more than 10ms itself. I wish
it didn't, but where I work we see it can happen (via SOLR-17320). Hmmm; I'm
not sure what action to take.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891152#comment-17891152 ]
Gus Heck commented on SOLR-17158:
-
I think the test you suggest defeats the code's efforts by putting all shard
queries into a 5 second sleep? All the limit code relies on the thread
executing the query to check itself. There's no supervisor thread, so if the
thread is forced into a sleep, it can't check itself until it wakes up. It's
not a very realistic test, because typically there will be variation among
shards, and in conjunction with the searcher-level checks it takes a pretty
interesting scenario for all shards to get stuck simultaneously within the
Lucene searcher code, or in a loop that has no check. This ticket mostly
ensures that once one thread times out, the effort/time spent waiting on the
remaining threads is minimized. One could imagine designs for timeAllowed
where the coordinator is completely divorced from the shard processing and has
a CountDownLatch that waits with a timeout and simply returns whatever happens
to be available when that expires, but that's not how things currently work,
and relative to CPU and memory limits that's a special case.
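The hypothetical latch-based design mentioned above can be sketched in plain `java.util.concurrent` terms (this is not how Solr works today, and the names are invented): the coordinator awaits a CountDownLatch with a timeout and returns whichever simulated shard responses arrived before the deadline.

```java
// Sketch of a deadline-driven coordinator: wait up to timeAllowedMs, then
// return whatever shard responses happen to be available.
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public final class DeadlineCoordinator {
  /** Simulates one shard per entry in shardDelaysMs. */
  public static List<String> gather(long[] shardDelaysMs, long timeAllowedMs) {
    var responses = new ConcurrentLinkedQueue<String>();
    var latch = new CountDownLatch(shardDelaysMs.length);
    for (int i = 0; i < shardDelaysMs.length; i++) {
      final int shard = i;
      final long delay = shardDelaysMs[i];
      Thread t = new Thread(() -> {
        try {
          Thread.sleep(delay); // stands in for per-shard query work
          responses.add("shard" + shard);
        } catch (InterruptedException ignored) {
          // a cancelled shard contributes nothing
        } finally {
          latch.countDown();
        }
      });
      t.setDaemon(true); // don't hold the JVM open for slow "shards"
      t.start();
    }
    try {
      // Returns early if every shard responds; otherwise gives up at the deadline.
      latch.await(timeAllowedMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return List.copyOf(responses);
  }
}
```

With delays of {0ms, 2000ms} and a 500ms budget, only the fast shard's response is returned; the slow shard's work is simply never waited for.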
As for exceptions, I've tried to avoid special-casing like you suggest and
stick to a simple, clear rule: one should only get an exception if the server
*refused (4xx)* or *failed (5xx)* the task you gave it. When you add a limit,
you give it a task that would be phrased as: "Spend X effort searching this
query, and if it takes more effort than that, (do|don't) tell me the documents
you managed to find." In the case you describe, where the request limit is
unreasonably short and nothing is found, the server has certainly not refused,
nor has it failed. The server will set the partial results attribute to
indicate that there might be more results. If there's any issue, it might be
the naming of "partial results", since that name sort-of sounds like there
must be some results. Perhaps a more communicative response would say
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsIncomplete" {code}
or
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"incompleteResultsOmitted"{code}
or
{code:java}
"resultMode":"bestEffort",
"resultStatus":"resultsComplete" {code}
or
{code:java}
"resultMode":"allOrNothing",
"resultStatus":"resultsComplete"{code}
However, that's a much more backwards-incompatible change.
I have thought it would be much nicer if there were an HTTP status code that
could be set to indicate partial results without having to parse the response.
The closest is
[206|https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/206], but that's
really meant for [Range
Requests|https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range],
and since we neither use the associated headers nor have a way to express, or
even satisfy, "give me the second 30 seconds of results", it's not a good fit.
We could just co-opt our own 2xx number and document it, perhaps even propose
it (since results limited by processing constraints probably could be
considered a generic situation... maybe {{225 LIMITED RESULTS}}?), but that
may cause problems for current users and is backwards-incompatible for folks
who check for {{200 OK}} specifically.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891053#comment-17891053 ]
David Smiley commented on SOLR-17158:
-
Question: if all shards time out and if shards.tolerant=true, then we don't
truly have partial results to return to the user. What should we do? I argue
failing is appropriate instead of 200 despite shards.tolerant.
I created the following test method on TestDistributedSearch (where
shards.tolerant is tested):
{code:java}
public void testTimeAllowed() throws Exception {
  var cluster =
      new MiniSolrCloudCluster.Builder(1, createTempDir())
          .addConfig("conf", configset("cloud-minimal"))
          .configure();
  CloudSolrClient client = cluster.getSolrClient();
  String COLLECTION_NAME = "test_coll";
  CollectionAdminRequest.createCollection(COLLECTION_NAME, "conf", 2, 1)
      .setRouterName("implicit")
      .setShards("shard_1,shard_2")
      .process(cluster.getSolrClient());
  cluster.waitForActiveCollection(COLLECTION_NAME, 2, 2);
  var query = new SolrQuery();
  query.setQuery("{!func}sleep(5000,0)");
  query.set(CommonParams.TIME_ALLOWED, 10);
  query.set(ShardParams.SHARDS_TOLERANT, "true");
  var response = client.query(COLLECTION_NAME, query);
  System.out.println(response);
  assertTrue(response.getElapsedTime() < 1000); // definitely less than a second
  cluster.shutdown();
}
{code}
And it fails, not actually timing out in the 10 milliseconds given for
timeAllowed. I expected the change in this issue would make this pass?
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890108#comment-17890108 ]
ASF subversion and git services commented on SOLR-17158:
Commit 6dbd8d1b8e0e1690b048e32bf152837b5a341fee in solr's branch refs/heads/branch_9x from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=6dbd8d1b8e0 ]
SOLR-17158 Terminate distributed processing faster when query limit is reached
and partial results are not needed (#2666) (#2773)
(cherry picked from commit b748207bc331b5eeae284ee7602626dbe5e3ff50)
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886290#comment-17886290 ]
ASF subversion and git services commented on SOLR-17158:
Commit b748207bc331b5eeae284ee7602626dbe5e3ff50 in solr's branch refs/heads/main from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b748207bc33 ]
SOLR-17158 Terminate distributed processing faster when query limit is reached
and partial results are not needed (#2666)
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833972#comment-17833972 ]
Gus Heck commented on SOLR-17158:
-
Even prior to this change, success/fail is a more limited concept than the
actual behavior... Requests may:
* Fail due to syntax, etc. (400x)
* Fail due to server issues (500x)
* Fail due to upstream node server issues (500x) (shards.tolerant not set)
* Succeed, returning all results (200)
* Succeed, returning partial results due to any of the following cases where
Solr is doing what we told it to:
** Query limit (timeAllowed, cpuAllowed) (200)
** Upstream node server returns 500 and shards.tolerant=true (200)
** Query cancellation (200)
All of those rely on the partialResults attribute, and the code has logic that
relies on the attribute after we no longer know why it was set, so this
parameter will have the effect of preventing partial results for all of these
cases. I have a specific check to return an error (400, bad request) if this
attribute is set at the same time as shards.tolerant, because those two
notions clearly conflict.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833944#comment-17833944 ]
Andrzej Bialecki commented on SOLR-17158:
-
[~dsmiley] these are not exactly equivalent - when a limit is reached, it
doesn't have to be related in any way to per-shard processing.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833722#comment-17833722 ]
David Smiley commented on SOLR-17158:
-
Fail vs Success should be based on {{shards.tolerant}} --
https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-tolerant-parameter
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820781#comment-17820781 ]
Andrzej Bialecki commented on SOLR-17158:
-
I'm not convinced we need a sysprop here... why shouldn't we use request
handler's {{defaults}} and {{invariants}} sections in {{solrconfig.xml}} ?
Using a sysprop effectively enforces the same default behavior for all replicas
of all collections managed by this Solr node.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820719#comment-17820719 ]
Gus Heck commented on SOLR-17158:
-
{quote}FYI, it was necessary to add this parameter in SOLR-17172, I used
{{partialResults=true}} to mean that we should stop processing and return
partial results with "success" code and "partialResults" flag in the response,
and {{partialResults=false}} to mean that we should throw an exception and
discard any partial results.
{quote}
I agree that the request should have a parameter to control this behavior, as
well as providing a sysprop to determine the default. I am setting it up such
that the parameter takes precedence over the sysprop, which only determines
the default behavior if the parameter is not supplied. May be a complicated
merge. I've got things like this:
{code:java}
if (thereArePartialResults && !rb.req.shouldDiscardPartials()) { {code}
That method will need to be used everywhere, because it encapsulates the logic
for both the sysprop and the params.
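The precedence rule described here ("parameter wins; sysprop only supplies the default") reduces to a one-liner. This is a sketch with invented names, not Solr's actual configuration code.

```java
// Hypothetical sketch of request-parameter-over-sysprop precedence:
// a present parameter always wins; the sysprop only fills in the default.
public final class DiscardPartialsConfig {
  /** @param requestParam    null when the request did not supply the parameter
   *  @param syspropDefault  the node-wide default taken from a sysprop */
  public static boolean shouldDiscardPartials(Boolean requestParam, boolean syspropDefault) {
    return requestParam != null ? requestParam : syspropDefault;
  }
}
```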
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820030#comment-17820030 ]
Andrzej Bialecki commented on SOLR-17158:
-
FYI, it was necessary to add this parameter in SOLR-17172, I used
{{partialResults=true}} to mean that we should stop processing and return
partial results with "success" code and "partialResults" flag in the response,
and {{partialResults=false}} to mean that we should throw an exception and
discard any partial results.
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819319#comment-17819319 ]
Andrzej Bialecki commented on SOLR-17158:
-
Adding some observations from reading the code in {{SolrIndexSearcher}} and
{{HttpShardHandler}}.
It appears that currently, when {{timeAllowed}} is reached, it doesn't cause
termination of all other pending shard requests. I found this section in
{{SolrIndexSearcher:284}}:
{code:java}
try {
  super.search(query, collector);
} catch (TimeLimitingCollector.TimeExceededException
    | ExitableDirectoryReader.ExitingReaderException
    | CancellableCollector.QueryCancelledException x) {
  log.warn("Query: [{}]; ", query, x);
  qr.setPartialResults(true);
{code}
In the case when it reaches the {{timeAllowed}} limit (and our new
{{QueryLimits}}, too), it simply sets {{partialResults=true}} and does NOT
throw any exception, so all the layers above think that the result is a
success.
I suspect the reason for this was that when {{timeAllowed}} was set we still
wanted to retrieve partial results when the limit was hit, and throwing an
exception here would prevent that.
OTOH, if we had a request param saying “discard everything when you reach a
limit and cancel any ongoing requests” then we could throw an exception here,
and {{ShardHandler}} would recognize this as an error and cancel all other
shard requests that are still pending, so that replicas could avoid sending
back their results that would be discarded anyway.
