[
https://issues.apache.org/jira/browse/SOLR-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Domenico Fabio Marino updated SOLR-10610:
-----------------------------------------
Description:
This patch proposes a new component that analyses a request and reports whether
it is too complex to continue processing.
This patch makes active use of SOLR-10880 , SOLR-10881 and SOLR-11343 .
This component is meant to help with troublesome queries: it allows a query to
be run on a subset of replicas (which could be isolated from user traffic) to
verify whether executing it is safe.
Running this component is conditional; if enabled, it runs before other
components start processing.
This is ensured because the whole execution of this component happens in
{{STAGE_PARSE_QUERY}} (SOLR-10609 proposes renaming this stage to something
more meaningful).
CanaryComponent reports the status of the analysis as a Boolean via the
{{CanaryComponent.CANARY_SUCCESS}} response parameter and through a field in
{{ResponseBuilder}}.
The same return convention applies to both:
* {{null/non-existing}} when the CanaryComponent did not execute the query
* {{true}} if the CanaryComponent processed the query and did not find any
problem
* {{false}} if the query execution didn't terminate normally.
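The three-valued convention above can be easy to misread on the client side, since a missing value and {{false}} mean different things. The following is a minimal sketch of how a caller might interpret it. The key string {{"canarySuccess"}} and the {{Outcome}} enum are hypothetical names used only for illustration; the actual value of {{CanaryComponent.CANARY_SUCCESS}} is not stated in this description.

```java
import java.util.Map;

final class CanaryStatusSketch {
    // Hypothetical key: stands in for CanaryComponent.CANARY_SUCCESS,
    // whose real string value is not given in the description.
    static final String CANARY_SUCCESS_KEY = "canarySuccess";

    enum Outcome { NOT_RUN, PASSED, FAILED }

    // Maps the nullable Boolean onto an explicit three-state result.
    static Outcome interpret(Map<String, Object> response) {
        Object value = response.get(CANARY_SUCCESS_KEY);
        if (value == null) {
            return Outcome.NOT_RUN;   // the component did not execute the query
        }
        return Boolean.TRUE.equals(value)
                ? Outcome.PASSED      // processed, no problem found
                : Outcome.FAILED;     // execution did not terminate normally
    }
}
```

Treating the absent value as its own state avoids unboxing a {{null}} Boolean and avoids confusing "not checked" with "check failed".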
The CanaryComponent needs to be properly set up before being used.
*Set-up:*
1) Tag one or more replicas as "canary" replicas (depends on SOLR-10880 and
SOLR-10881):
Using replica properties, set a property to a value (the canary type). This
property can be independent of any other shard filtering property, but it does
not have to be.
A collection can have many canaries of many types, for example:
* {{shard1replica3, shard2replica1}} have the property {{canaryColour=yellow}}
* {{shard4replica2}} has the property {{canaryColour=red}}
And so on.
There can be multiple canary replicas per shard.
{{CanaryComponentTest.java}} shows an example of such tagging.
2) Add the CanaryComponent to the {{/select}} RequestHandler (an example is
included in the {{cloud-canary}} test config files).
Optional but encouraged: set the flag {{canary.timeout}} to a sensible Long
value (time in milliseconds). This ensures that all requests have a timeout
specified.
Note: the timeout can also be specified on a per-request basis.
This concludes the initial set-up.
*Usage*
For each request that needs to be run through the CanaryComponent, the
following parameters have to be added (depends on SOLR-10880):
{{filterByReplicaProp=true}}
and
{{canary=CANARY_TYPE_PROPERTY:CANARY_TYPE}}
A timeout must also be specified, either on the request itself or, for
convenience, in the configuration as described in point 2.
Running a query on the canary without a timeout is not permitted and causes an
exception to be thrown.
For example:
{code:java}filterByReplicaProp=true&canary=birdColour:yellow&canary.timeout=5000{code}
This means that the request needs the replica filtering framework enabled (see
SOLR-10880), that the canary request will be routed to the replicas that have
the property {{birdColour}} set to {{yellow}}, and that the request should time
out after 5 seconds.
An example of such requests can be seen in {{CanaryComponentTest.java}}.
The request will only run on one canary replica. If multiple replicas match
{{CANARY_TYPE_PROPERTY:CANARY_TYPE}}, a random one is picked among them; should
it be unreachable, another random one is chosen, and so on.
Example:
Given the following list of replicas matching
{{CANARY_TYPE_PROPERTY:CANARY_TYPE}}:
{code:java}["", "shard2replica2", "", "shard4replica3|shard4replica2"]{code}
the component will rearrange it into this list:
{code:java}["shard2replica2|shard4replica3|shard4replica2"]{code}
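The rearrangement in the example above can be sketched as dropping the empty per-shard entries and joining the rest with {{|}} into a single entry. This is only an illustration of the transformation as described, not the code from the patch; the class and method names are hypothetical. The resulting form resembles the pipe-separated list of alternatives accepted by Solr's {{shards}} parameter, which would be consistent with "pick one, fall back to another", but that link is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class CanaryReplicaListSketch {
    // Collapses a per-shard list of canary replicas into one entry whose
    // alternatives are separated by '|'. Empty entries (shards without a
    // matching canary replica) are skipped.
    static List<String> collapse(List<String> perShard) {
        String joined = perShard.stream()
                .filter(entry -> entry != null && !entry.isEmpty())
                .collect(Collectors.joining("|"));
        return joined.isEmpty()
                ? Collections.<String>emptyList()
                : Collections.singletonList(joined);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "", "shard2replica2", "", "shard4replica3|shard4replica2");
        System.out.println(collapse(input));
    }
}
```

Returning an empty list when nothing matches leaves it to the caller to raise the "no matching replica" error described under the usage warnings.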
The request is executed exactly as the QueryComponent would execute it (depends
on SOLR-11343) so that the analysis is as realistic as possible. However, its
execution is performed in a separate thread.
This allows any exceptions thrown by the query to be caught and its execution
time to be monitored at a finer level. The execution of the query is halted as
soon as an exception is detected or the timeout expires.
The CanaryComponent clears its query results so that other components will not
see partial results.
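The separate-thread execution described above follows a common pattern: submit the work to an executor, wait on the resulting future with a deadline, and treat both an exception and a timeout as failure. The sketch below shows that general pattern using only {{java.util.concurrent}}; it is not the code from the patch, and the names {{CanaryTimeoutSketch}} and {{runWithTimeout}} are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class CanaryTimeoutSketch {
    // Runs the query in its own thread. Returns true only if it finished
    // within timeoutMs without throwing; false on exception or timeout.
    static boolean runWithTimeout(Callable<?> query, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(query);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true);      // interrupts the worker thread
            return false;
        } catch (ExecutionException e) {
            return false;             // the query itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that {{Future.cancel(true)}} only delivers an interrupt, so in this pattern the work actually stops early only if the running task responds to interruption.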
*Usage warnings*
* Running a query with the parameter {{canary}} set, but without
{{ShardParams.FILTER_BY_REPLICA_PROPERTY}} will cause the CanaryComponent to
throw an exception.
* Running a query with the parameter {{canary}} set, but without any replica
matching the property and tag specified will cause the CanaryComponent to throw
an exception.
* Running a query with the parameter {{canary}} set to a value not matching the
format {{CANARY_TYPE_PROPERTY:CANARY_TYPE}} will cause the CanaryComponent to
throw an exception.
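The parameter checks implied by these warnings, together with the mandatory timeout, can be sketched as a single validation step. This is an illustrative sketch only: {{IllegalArgumentException}} stands in for whatever exception type the component really throws, the class and method names are hypothetical, and splitting on the first {{:}} is an assumption about how the {{CANARY_TYPE_PROPERTY:CANARY_TYPE}} format is parsed. The check for matching replicas is omitted because it needs cluster state.

```java
final class CanaryParamSketch {
    // Validates the request-level preconditions and returns
    // { property, type } parsed from the canary parameter.
    static String[] parseCanary(String canary,
                                boolean filterByReplicaProp,
                                Long timeoutMs) {
        if (!filterByReplicaProp) {
            throw new IllegalArgumentException(
                    "canary requires filterByReplicaProp=true");
        }
        if (timeoutMs == null) {
            throw new IllegalArgumentException(
                    "canary requires canary.timeout to be set");
        }
        // Assumption: the first ':' separates property from type,
        // and both sides must be non-empty.
        int sep = (canary == null) ? -1 : canary.indexOf(':');
        if (sep <= 0 || sep == canary.length() - 1) {
            throw new IllegalArgumentException(
                    "canary must look like CANARY_TYPE_PROPERTY:CANARY_TYPE");
        }
        return new String[] {
                canary.substring(0, sep), canary.substring(sep + 1) };
    }
}
```

For the example request shown earlier, {{parseCanary("birdColour:yellow", true, 5000L)}} would yield the pair {{birdColour}} and {{yellow}}.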
was:
This patch outlines a new component that analyses a request and reports whether
it is too complex to continue processing.
Running this component should be conditional and happen before other components
start processing. The component will set a status flag so that other components
can know the result of the Canary check, and also adds some information to the
response sent back to the client.
The component runs the query on the set of replicas that are tagged with the
"canary" tag.
Please note this is only an outline so far and it therefore lacks test cases.
When this will be more feature-complete, a test case will be added
> Add CanaryComponent, a search component to analyse requests
> -----------------------------------------------------------
>
> Key: SOLR-10610
> URL: https://issues.apache.org/jira/browse/SOLR-10610
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Domenico Fabio Marino
> Priority: Minor
> Attachments: SOLR-10610.patch, SOLR-10610.patch