[ 
https://issues.apache.org/jira/browse/SOLR-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-10880:
---------------------------------------
    Attachment: SOLR-10880.patch

Hi Domenico,

Thanks for continuing to work on this by updating the patch!

Not having looked at the new code in a while, I struggled somewhat to find my 
way back into it: the new replica filtering logic is flexible and powerful, but 
because of that its code is naturally quite complex too.

The attached revised patch:
* mainly tweaks the TagReplicaMapper code to try and make it more accessible, 
and adds some more comments
* starts a section for the new params in the [Distributed 
Requests|https://lucene.apache.org/solr/guide/6_6/distributed-requests.html] 
section of the Solr Reference Guide
* adds a couple of TODO/question comments (no biggies, just bits that jumped out 
when looking at the code again after a while)

*New parameter syntax question*
How about turning
{code}
shards.filter=replicaProp.continent:europe
{code}
into
{code}
shards.filter=replicaProp:continent=europe
{code}
to match the potential future
{code}
shards.filter=replicaType:NRT
{code}
and/or
{code}
shards.filterNot=nodeRole:overseer
{code}
style?
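
To make the comparison concrete, here is a minimal Java sketch of how a value in the proposed {{type:value}} / {{type:key=value}} style might be split into its parts. The {{FilterSpec}} class and its field and method names are hypothetical and are not part of the attached patch; only the parameter values are taken from the examples above.

```java
import java.util.Objects;

/** Hypothetical holder for one parsed shards.filter / shards.filterNot value (not from the patch). */
final class FilterSpec {
  final String type;   // e.g. "replicaProp", "replicaType", "nodeRole"
  final String key;    // property name for the key=value form, otherwise null
  final String value;  // wanted value, e.g. "europe", "NRT", "overseer"

  private FilterSpec(String type, String key, String value) {
    this.type = type;
    this.key = key;
    this.value = value;
  }

  /** Parses the proposed "type:value" or "type:key=value" form. */
  static FilterSpec parse(String raw) {
    Objects.requireNonNull(raw, "raw");
    int colon = raw.indexOf(':');
    if (colon <= 0 || colon == raw.length() - 1) {
      throw new IllegalArgumentException("expected type:value but got '" + raw + "'");
    }
    String type = raw.substring(0, colon);
    String rest = raw.substring(colon + 1);
    int eq = rest.indexOf('=');
    if (eq < 0) {
      // Plain "type:value", e.g. replicaType:NRT
      return new FilterSpec(type, null, rest);
    }
    if (eq == 0 || eq == rest.length() - 1) {
      throw new IllegalArgumentException("expected key=value but got '" + rest + "'");
    }
    // "type:key=value", e.g. replicaProp:continent=europe
    return new FilterSpec(type, rest.substring(0, eq), rest.substring(eq + 1));
  }
}
```

With one separator for the filter type and another for the property name, the same parsing step would cover both the replica property case and the potential future {{replicaType}} / {{nodeRole}} cases.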

> Support replica filtering by tag
> --------------------------------
>
>                 Key: SOLR-10880
>                 URL: https://issues.apache.org/jira/browse/SOLR-10880
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Domenico Fabio Marino
>         Attachments: SOLR-10880.patch, SOLR-10880.patch, SOLR-10880.patch, 
> SOLR-10880.patch, SOLR-10880.patch, SOLR-10880.patch
>
>
> Add a mechanism to allow queries to use only a subset of replicas (by 
> specifying the wanted replica tag).
> Replicas have to be marked with tags before running the query.
> *Setup needed from the replica side*
> Set the required properties to the required values in at least one replica.
> ----
> *Setup needed from the query side*
> A query has to set {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} to indicate 
> that it is interested in replica property filtering.
> Then it should specify {{ShardParams.SHARDS_FILTER}} or 
> {{ShardParams.SHARDS_FILTERNOT}} set to {{ShardParams.REPLICA_PROP}} followed 
> by the property to be checked, then ":", and then the wanted value.
> Example:
> Given that some replicas have a property named {{region}}:
> Adding the following params to the query:
> {{filterByReplicaProp=true&shards.filter=replicaProp.region:EMEA}}
> will ensure that the query uses replicas that have the property {{region}} 
> set to {{EMEA}}
> {{filterByReplicaProp=true&shards.filterNot=replicaProp.region:EMEA}}
> will ensure that the query *does not* use replicas that have the property 
> {{region}} set to {{EMEA}} 
> ----
> An example can be seen in the {{ReplicaTagTest}} included in this patch where 
> a dynamic cloud has some tags assigned to it both randomly and on a fixed 
> basis.
> A replica can have multiple tags attached to it. By default these tags are 
> separated by "|" (pipe character); the delimiter can be changed by setting 
> {{ShardParams.REPLICA_TAG_DELIMITER}} in the query to anything else.
> The {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} is needed because the 
> computation required to filter by property:value is quite complex and queries 
> that don't care about replica filtering should not incur the performance 
> penalty.
> The {{ShardParams.REPLICA_PROP}} (currently set to {{replicaProp.}}) is needed 
> to ensure that the system is extensible in the future.
> *Usage warnings*
> Using {{ShardParams.SHARDS_FILTER}} or {{ShardParams.SHARDS_FILTERNOT}} set 
> to {{ShardParams.REPLICA_PROP}} without 
> {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} will cause the {{QueryComponent}} 
> to throw exceptions.
> Using {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} without filters will not 
> cause any error, but will likely waste computation time.
> No validity check is performed on the tags, therefore one may get an array of 
> shard URLs that contains empty URLs, or that is null (when the property does 
> not exist). The user of this feature has to deal with it.
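
As a rough illustration of the filtering semantics described in the issue summary above, the following Java sketch selects replica URLs by a tag in a property value. It is not the patch's implementation: the {{ReplicaTagFilterSketch}} class, its method names, and the map-based replica representation are all hypothetical, and treating a replica without the property as a non-match is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of tag-based replica selection (not the code from the patch). */
final class ReplicaTagFilterSketch {

  /** Returns true if the property value, split on the delimiter, contains the wanted tag. */
  static boolean hasTag(String propertyValue, String wanted, String delimiter) {
    if (propertyValue == null) {
      return false; // assumption: a replica without the property never matches
    }
    // Pattern.quote makes the delimiter literal, since "|" is a regex metacharacter.
    String[] tags = propertyValue.split(Pattern.quote(delimiter));
    return Arrays.asList(tags).contains(wanted);
  }

  /**
   * Selects replica URLs from a map of URL to replica properties.
   * negate=false mimics shards.filter, negate=true mimics shards.filterNot.
   */
  static List<String> select(Map<String, Map<String, String>> replicas,
                             String property, String wanted,
                             String delimiter, boolean negate) {
    List<String> urls = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> e : replicas.entrySet()) {
      boolean match = hasTag(e.getValue().get(property), wanted, delimiter);
      if (match != negate) {
        urls.add(e.getKey());
      }
    }
    return urls;
  }
}
```

A caller would still need to handle an empty result, in line with the usage warnings above about empty or null shard URL arrays.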


