[ 
https://issues.apache.org/jira/browse/SOLR-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Domenico Fabio Marino updated SOLR-10610:
-----------------------------------------
    Attachment: SOLR-10610.patch

Updated patch to lessen the changes in QueryComponent.java

> Add CanaryComponent, a search component to analyse requests
> -----------------------------------------------------------
>
>                 Key: SOLR-10610
>                 URL: https://issues.apache.org/jira/browse/SOLR-10610
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Domenico Fabio Marino
>            Priority: Minor
>         Attachments: SOLR-10610.patch, SOLR-10610.patch, SOLR-10610.patch
>
>
> This patch proposes a new component that analyses a request and reports 
> whether it is too complex to continue processing.
> This patch makes active use of SOLR-10880, SOLR-10881 and SOLR-11343.
> This component is meant to help with troublesome queries: it allows a query 
> to be run on a subset of replicas (which could be isolated from user 
> traffic) to verify whether executing it is safe.
> The component runs conditionally; when enabled, it runs before 
> other components start processing.
> This is ensured because the component executes entirely in the 
> {{STAGE_PARSE_QUERY}} stage (SOLR-10609 proposes to rename this 
> stage to a more meaningful name).
> CanaryComponent reports the status of the analysis as a Boolean via the 
> {{CanaryComponent.CANARY_SUCCESS}} response parameter and through a field in 
> {{ResponseBuilder}}.
> The same return convention applies to both:
> * {{null/non-existing}} when the CanaryComponent did not execute the query
> * {{true}} if the CanaryComponent processed the query and did not find any 
> problem
> * {{false}} if the query execution didn't terminate normally.
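The three-way convention above can be made explicit on the client side. The following is a minimal sketch, assuming the caller has already read the flag from the response into a nullable {{Boolean}}; the class {{CanaryOutcome}} and its {{Status}} enum are hypothetical names for illustration and are not part of the patch.

```java
// Hypothetical helper, not part of SOLR-10610: it only illustrates the
// null / true / false convention described in the issue text.
final class CanaryOutcome {

    enum Status { NOT_RUN, PASSED, FAILED }

    /** Maps the nullable flag to an explicit three-way status. */
    static Status fromFlag(Boolean canarySuccess) {
        if (canarySuccess == null) {
            // Flag absent: the CanaryComponent did not execute the query.
            return Status.NOT_RUN;
        }
        // true: no problem found; false: execution did not terminate normally.
        return canarySuccess ? Status.PASSED : Status.FAILED;
    }
}
```

Checking for {{null}} first avoids an unboxing {{NullPointerException}} when the component did not run.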
> The CanaryComponent needs to be set up properly before it is used.
> *Set-up:*
> 1) Tag one or more replicas as "canary" replicas (depends on SOLR-10880 
> and SOLR-10881):
>   Using replica properties, set a property to a value (the canary type). 
> This property can be independent of any other shard-filtering property, 
> but it does not have to be.
>   A collection can have many canaries of many types, for example:
>  *  {{shard1replica3, shard2replica1}} have the property 
> {{canaryColour=yellow}}
>  *  {{shard4replica2}} has the property {{canaryColour=red}}
>   And so on.
>   There can be multiple canary replicas per shard.
>   {{CanaryComponentTest.java}} shows an example of such tagging.
> 2) Add the CanaryComponent to the {{/select}} RequestHandler 
> (an example is included in the {{cloud-canary}} test config files).
>   Optional but encouraged: set {{canary.timeout}} to a sensible Long 
> value (time in milliseconds). This ensures that every request has a 
> timeout specified.
>   Note: the timeout can also be specified on a per-request basis.
> This concludes the initial set-up.
> *Usage*
> For each request that needs to be run through the CanaryComponent, the 
> following parameters must be added (depends on SOLR-10880):
> {{filterByReplicaProp=true}}
> and
> {{canary=CANARY_TYPE_PROPERTY:CANARY_TYPE}}
> A timeout must also be specified, either on the request itself or, for 
> convenience, as the default described in step 2.
> Running a query on the canary without a timeout is not permitted and causes 
> an exception to be thrown.
> For example: 
> {code:java}filterByReplicaProp=true&canary=birdColour:yellow&canary.timeout=5000{code}
> This means that the request needs the replica filtering framework enabled 
> (see SOLR-10880), that the canary request will be routed to the 
> replicas that have the property {{birdColour}} set to {{yellow}}, and that 
> the request should time out after 5 seconds.
> Example requests can be seen in {{CanaryComponentTest.java}}.
> The request runs on only one canary replica. If multiple replicas match 
> {{CANARY_TYPE_PROPERTY:CANARY_TYPE}}, one is picked at random; if it is 
> unreachable, another is picked at random, and so on.
> Example:
> Given the following list of replicas matching 
> {{CANARY_TYPE_PROPERTY:CANARY_TYPE}}: {code:java}["", 
> "shard2replica2", "", "shard4replica3|shard4replica2"]{code}
> the component rearranges it into this list: 
> {code:java}["shard2replica2|shard4replica3|shard4replica2"]{code}
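The rearrangement in the example above can be sketched as follows. This is only an illustration of the list transformation, assuming each input entry is a pipe-separated list of replicas for one shard (empty when no replica matched); the class and method names are hypothetical, and the random selection among candidates mentioned in the description is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not taken from the patch.
final class CanaryReplicaLists {

    /**
     * Collapses per-shard candidate lists into a single pipe-separated entry,
     * dropping the shards that have no matching canary replica.
     */
    static List<String> mergeCandidates(List<String> perShard) {
        List<String> nonEmpty = new ArrayList<>();
        for (String entry : perShard) {
            if (entry != null && !entry.isEmpty()) {
                nonEmpty.add(entry);
            }
        }
        if (nonEmpty.isEmpty()) {
            // No canary replica matched at all; the issue text says the
            // component throws in this case, which is left to the caller here.
            return Collections.emptyList();
        }
        return Collections.singletonList(String.join("|", nonEmpty));
    }
}
```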
> The request is executed exactly as the QueryComponent would execute it 
> (depends on SOLR-11343), so that the analysis is as realistic as possible. 
> However, it is executed in a separate thread.
> This allows any exceptions thrown by the query to be caught and its 
> execution time to be monitored at a finer level. Execution of the query is 
> halted as soon as an exception is detected or the timeout expires.
> CanaryComponent will clean its query results so that other components will 
> not see partial results.
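The "separate thread with a timeout" idea can be sketched with the standard {{java.util.concurrent}} classes. This is not the patch's implementation: {{CanaryRunner}} and {{runWithTimeout}} are hypothetical names, and the query is stood in for by an arbitrary {{Callable}}.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not taken from the patch.
final class CanaryRunner {

    /**
     * Runs the task on its own thread and reports true only if it finishes
     * normally within the timeout; exceptions and timeouts both yield false.
     */
    static boolean runWithTimeout(Callable<?> query, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(query);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Took too long: ask the worker thread to stop.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The task itself threw; the cause is available via e.getCause().
            return false;
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and stop the task.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that {{Future.cancel(true)}} only interrupts the worker thread; a task that ignores interruption keeps running, so a real component would also need a cooperative way to halt the query.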
> *Usage warnings*
> * Running a query with the parameter {{canary}} set, but without 
> {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} will cause the CanaryComponent to 
> throw an exception.
> * Running a query with the parameter {{canary}} set, but without any replica 
> matching the property and tag specified will cause the CanaryComponent to 
> throw an exception.
> * Running a query with the parameter {{canary}} set to a value not matching 
> the format {{CANARY_TYPE_PROPERTY:CANARY_TYPE}} will cause the 
> CanaryComponent to throw an exception.
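
The parameter checks listed in the warnings above, together with the mandatory timeout, could look roughly like the sketch below. It is an assumption-laden illustration: the class name, the method signature, the use of {{IllegalArgumentException}}, and the exact format rule (exactly one colon with non-empty text on both sides) are all hypothetical, and the check for matching replicas is omitted because it requires cluster state.

```java
// Hypothetical validation sketch, not taken from the patch.
final class CanaryRequestCheck {

    /**
     * Validates the canary-related request parameters and returns
     * {property, type} parsed from the canary value.
     */
    static String[] validate(String canary, boolean filterByReplicaProp, Long timeoutMs) {
        if (!filterByReplicaProp) {
            throw new IllegalArgumentException(
                "canary requires filterByReplicaProp=true");
        }
        if (timeoutMs == null) {
            throw new IllegalArgumentException(
                "canary requires a timeout (canary.timeout)");
        }
        // Assumed format rule: PROPERTY:TYPE with both parts non-empty.
        String[] parts = canary == null ? new String[0] : canary.split(":", -1);
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException(
                "canary must look like CANARY_TYPE_PROPERTY:CANARY_TYPE");
        }
        return parts;
    }
}
```

With the example request from the description, {{validate("birdColour:yellow", true, 5000L)}} would return the pair {{birdColour}} and {{yellow}}.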



