[
https://issues.apache.org/jira/browse/SOLR-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christine Poerschke updated SOLR-10880:
---------------------------------------
Attachment: SOLR-10880.patch
Hi Domenico,
Thanks for continuing to work on this by updating the patch!
Having not looked at the new code in a while, I somewhat struggled to find my
way back into it: the new replica filtering logic is flexible and powerful but
because of that its code is naturally quite complex too.
The attached revised patch
* mainly tweaks the TagReplicaMapper code to try and make it more accessible,
and adds some more comments
* starts a section for the new params in the [Distributed
Requests|https://lucene.apache.org/solr/guide/6_6/distributed-requests.html]
section of the Solr Reference Guide
* adds a couple of TODO/question comments (no biggies, just bits that jumped out
when looking at the code again after a while)
*New parameter syntax question*
How about turning
{code}
shards.filter=replicaProp.continent:europe
{code}
into
{code}
shards.filter=replicaProp:continent=europe
{code}
to match the potential future
{code}
shards.filter=replicaType:NRT
{code}
and/or
{code}
shards.filterNot=nodeRole:overseer
{code}
style?
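To make the comparison concrete, here is a minimal sketch of how both styles could be parsed into the same structure. The {{ShardFilterSyntax}} class and its methods are hypothetical and are not part of the attached patch; the sketch only assumes that a filter is made of a kind, an optional property name and a value.

```java
/** Hypothetical helper for illustration only; not part of the SOLR-10880 patch. */
public final class ShardFilterSyntax {

  /** A parsed filter: kind (e.g. "replicaProp"), optional property name, and value. */
  public static final class Filter {
    public final String kind;
    public final String name;   // null when the kind has no sub-property, e.g. replicaType:NRT
    public final String value;

    Filter(String kind, String name, String value) {
      this.kind = kind;
      this.name = name;
      this.value = value;
    }

    @Override
    public String toString() {
      return kind + (name == null ? "" : "[" + name + "]") + "=" + value;
    }
  }

  /** Parses the current style: kind.name:value, e.g. replicaProp.continent:europe */
  public static Filter parseCurrent(String raw) {
    int dot = raw.indexOf('.');
    int colon = raw.indexOf(':', dot + 1);
    if (dot <= 0 || colon <= dot + 1 || colon == raw.length() - 1) {
      throw new IllegalArgumentException("expected kind.name:value, got: " + raw);
    }
    return new Filter(raw.substring(0, dot), raw.substring(dot + 1, colon), raw.substring(colon + 1));
  }

  /** Parses the proposed style: kind:value or kind:name=value */
  public static Filter parseProposed(String raw) {
    int colon = raw.indexOf(':');
    if (colon <= 0 || colon == raw.length() - 1) {
      throw new IllegalArgumentException("expected kind:value, got: " + raw);
    }
    String kind = raw.substring(0, colon);
    String rest = raw.substring(colon + 1);
    int eq = rest.indexOf('=');
    if (eq < 0) {
      return new Filter(kind, null, rest);   // e.g. replicaType:NRT
    }
    if (eq == 0 || eq == rest.length() - 1) {
      throw new IllegalArgumentException("expected kind:name=value, got: " + raw);
    }
    return new Filter(kind, rest.substring(0, eq), rest.substring(eq + 1));
  }

  public static void main(String[] args) {
    System.out.println(parseCurrent("replicaProp.continent:europe"));
    System.out.println(parseProposed("replicaProp:continent=europe"));
    System.out.println(parseProposed("replicaType:NRT"));
  }
}
```

With the proposed style the text before the first ":" always names the filter kind, so a future {{replicaType}} or {{nodeRole}} filter could go through the same parsing path as {{replicaProp}}.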
> Support replica filtering by tag
> --------------------------------
>
> Key: SOLR-10880
> URL: https://issues.apache.org/jira/browse/SOLR-10880
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Domenico Fabio Marino
> Attachments: SOLR-10880.patch, SOLR-10880.patch, SOLR-10880.patch,
> SOLR-10880.patch, SOLR-10880.patch, SOLR-10880.patch
>
>
> Add a mechanism to allow queries to use only a subset of replicas (by
> specifying the wanted replica tag).
> Replicas have to be marked with tags before running the query.
> *Setup needed from the replica side*
> Set the required properties to the required values in at least one replica.
> ----
> *Setup needed from the query side*
> A query has to set {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} to indicate
> that it is interested in replica property filtering.
> It should then set {{ShardParams.SHARDS_FILTER}} or
> {{ShardParams.SHARDS_FILTERNOT}} to {{ShardParams.REPLICA_PROP}} followed
> by the name of the property to check, then ":", then the wanted value.
> Example:
> Given that some replicas have a property named {{region}}:
> Adding the following params to the query:
> {{filterByReplicaProp=true&shards.filter=replicaProp.region:EMEA}}
> will ensure that the query uses replicas that have the property {{region}}
> set to {{EMEA}}
> {{filterByReplicaProp=true&shards.filterNot=replicaProp.region:EMEA}}
> will ensure that the query *does not* use replicas that have the property
> {{region}} set to {{EMEA}}
> ----
> An example can be seen in the {{ReplicaTagTest}} included in this patch where
> a dynamic cloud has some tags assigned to it both randomly and on a fixed
> basis.
> A replica can have multiple tags attached to it. These tags are separated
> by "|" (pipe character) by default; the delimiter can be changed by setting
> {{ShardParams.REPLICA_TAG_DELIMITER}} in the query to anything else.
> The {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} is needed because the
> computation required to filter by property:value is quite complex and queries
> that don't care about replica filtering should not incur the performance
> penalty.
> The {{ShardParams.REPLICA_PROP}} (currently set to {{replicaProp.}}) is needed
> to ensure that the system is extensible in the future.
> *Usage warnings*
> Using {{ShardParams.SHARDS_FILTER}} or {{ShardParams.SHARDS_FILTERNOT}} set
> to {{ShardParams.REPLICA_PROP}} without
> {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} will cause the {{QueryComponent}}
> to throw exceptions.
> Using {{ShardParams.FILTER_BY_REPLICA_PROPERTY}} without filters will not
> cause any error, but will likely waste computation time.
> No validity check is performed on the tags, so one may get an array of
> shard URLs that contains empty URLs, or that is null (when the property does
> not exist); the user of this feature has to deal with it.
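The tag matching and the empty-result case described in the issue can be illustrated with a small standalone sketch. It does not use the patch's classes: {{ReplicaTagFilterSketch}}, its map-based replica model and the example URLs are all hypothetical. It also assumes that a negated filter keeps replicas that lack the property, and it returns an empty list rather than null so that the caller has a single case to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical illustration only; the real patch works on Solr's own replica classes. */
public final class ReplicaTagFilterSketch {

  /** True if the delimited property value contains the wanted tag. */
  static boolean hasTag(String propertyValue, String wanted, String delimiter) {
    if (propertyValue == null) {
      return false; // the replica does not have this property at all
    }
    // Pattern.quote is needed because the default delimiter "|" is a regex metacharacter.
    return Arrays.asList(propertyValue.split(Pattern.quote(delimiter))).contains(wanted);
  }

  /**
   * Returns the URLs of replicas whose property matches (negate=false, like shards.filter)
   * or does not match (negate=true, like shards.filterNot). Never returns null.
   */
  static List<String> filter(Map<String, Map<String, String>> replicas,
                             String property, String wanted,
                             String delimiter, boolean negate) {
    List<String> urls = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> e : replicas.entrySet()) {
      boolean match = hasTag(e.getValue().get(property), wanted, delimiter);
      if (match != negate) {
        urls.add(e.getKey());
      }
    }
    return urls;
  }

  /** Builds a tiny made-up cluster: replica URL -> replica properties. */
  static Map<String, Map<String, String>> exampleReplicas() {
    Map<String, Map<String, String>> replicas = new LinkedHashMap<>();
    Map<String, String> p1 = new LinkedHashMap<>();
    p1.put("region", "EMEA|APAC");              // two tags, default "|" delimiter
    Map<String, String> p2 = new LinkedHashMap<>();
    p2.put("region", "AMER");
    Map<String, String> p3 = new LinkedHashMap<>(); // no properties set
    replicas.put("http://host1:8983/solr/c1", p1);
    replicas.put("http://host2:8983/solr/c1", p2);
    replicas.put("http://host3:8983/solr/c1", p3);
    return replicas;
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> replicas = exampleReplicas();
    System.out.println(filter(replicas, "region", "EMEA", "|", false));
    System.out.println(filter(replicas, "region", "EMEA", "|", true));
    List<String> none = filter(replicas, "continent", "europe", "|", false);
    if (none.isEmpty()) {
      System.out.println("no replica matched; the caller must decide how to proceed");
    }
  }
}
```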
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)