[
https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739731#comment-16739731
]
Gus Heck commented on SOLR-13131:
---------------------------------
h1. Functionality
h2. New Parameter Value
*router.name* would gain a new valid value of "category"
h2. New Params
This feature would need some safety valves to avoid runaway collection creation
(similar in spirit to router.maxFutureMs for TRAs). To that end I suggest:
# *router.maxCardinality* to place a limit on the total number of collections
that can be created (maybe required?)
# *router.mustMatch* to provide pattern matching for valid data and reject
requests that would create an undesired collection (optional)
# {color:#707070}*router.dictionary*{color} might also be added to provide a
set of acceptable values (optional) - This may or may not be implemented as
part of this ticket.
With respect to router.dictionary, I could imagine there being a desire to have
that dictionary used as a spell checker for segments of the values. One could
break the value on _ (or something else) and make sure all the parts are
spelled properly. One could also imagine the dictionary being applied to
specific matching groups from router.mustMatch, but all of this dictionary
based checking would be a future enhancement. I'm mentioning it here to get the
idea out there for future reference.
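As a rough illustration of how *router.maxCardinality* and *router.mustMatch* could act together as safety valves, here is a minimal sketch. The class and method names are hypothetical and are not existing Solr code, and treating mustMatch as a whole-value regex match is an assumption.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch; not an existing Solr class.
class CategoryGuardSketch {
  private final int maxCardinality;   // stands in for router.maxCardinality
  private final Pattern mustMatch;    // stands in for router.mustMatch; null when unset
  private final Set<String> knownCategories = new HashSet<>();

  CategoryGuardSketch(int maxCardinality, String mustMatchRegex) {
    this.maxCardinality = maxCardinality;
    this.mustMatch = mustMatchRegex == null ? null : Pattern.compile(mustMatchRegex);
  }

  /**
   * Returns true if the value would trigger creation of a new collection,
   * false if its category is already known. Throws if the value is rejected.
   */
  boolean accept(String value) {
    // Assumption: the whole value must match the pattern.
    if (mustMatch != null && !mustMatch.matcher(value).matches()) {
      throw new IllegalArgumentException(
          "Value does not match router.mustMatch: " + value);
    }
    if (knownCategories.contains(value)) {
      return false; // existing category, nothing new to create
    }
    if (knownCategories.size() >= maxCardinality) {
      throw new IllegalStateException(
          "router.maxCardinality (" + maxCardinality + ") would be exceeded by: " + value);
    }
    knownCategories.add(value);
    return true;
  }
}
```

With a limit of 2 and the pattern `[a-z]+`, the first two distinct lowercase values are accepted, a third distinct value is rejected for cardinality, and a value such as "Bad Value" is rejected by the pattern before any collection would be created.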
h2. Routed Field Constraints
The data in the field to be routed will need to be constrained in a couple of
ways to make this work:
# The routed field would need to be single valued, and encountering multiple
values should throw an error.
# The value in the routed field must be convertible to a valid collection
name. This conversion will likely be done by replacing any invalid characters
with '_', and it is the user's responsibility to ensure that the resulting names
are unique and do not interfere with other collections in the system. Values
that resolve to an existing collection that is not part of the alias will cause
an error to be returned; the existing collection will remain unaffected and
will not be added to the alias.
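The conversion described in the second constraint could look something like the following sketch. The helper name is hypothetical, and the set of characters treated as valid (letters, digits, '_', '.', and '-') is an assumption about what a collection name may contain.

```java
// Hypothetical sketch; not an existing Solr class.
class CollectionNameSketch {
  /** Replaces every character outside the assumed valid set with '_'. */
  static String toCollectionName(String value) {
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException("Routed field value must not be empty");
    }
    // Assumed valid characters: A-Z, a-z, 0-9, '_', '.', '-'
    return value.replaceAll("[^A-Za-z0-9_.\\-]", "_");
  }
}
```

Note that distinct inputs can collapse to the same name: both "a b" and "a/b" become "a_b". This is why uniqueness of the resulting names is left to the user.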
h2. Validations
In addition to constraints on the values, the following validations will be
enforced at the time the CategoryRoutedAlias is created:
# The *collections* attribute is not set (applies only to non-routed aliases)
# None of the TimeRoutedAlias attributes are present
# TimeRoutedAliases will also be modified to validate that
*router.maxCardinality* and *router.mustMatch* are not set
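The creation-time checks listed above can be sketched as a single validation pass over the alias parameters. Everything here is hypothetical: the class name, the use of a plain parameter map, and the particular list of time-routed attribute names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not an existing Solr class.
class AliasValidationSketch {
  // Assumed, illustrative list of attributes that only make sense for time routing.
  static final List<String> TIME_ONLY =
      Arrays.asList("router.start", "router.interval", "router.maxFutureMs");
  // Attributes proposed in this comment for category routing.
  static final List<String> CATEGORY_ONLY =
      Arrays.asList("router.maxCardinality", "router.mustMatch");

  /** Returns one message per violated rule; an empty list means the params are valid. */
  static List<String> validate(Map<String, String> params) {
    List<String> errors = new ArrayList<>();
    String router = params.get("router.name");
    if ("category".equals(router)) {
      if (params.containsKey("collections")) {
        errors.add("collections must not be set on a category routed alias");
      }
      for (String key : TIME_ONLY) {
        if (params.containsKey(key)) {
          errors.add(key + " is only valid for time routed aliases");
        }
      }
    } else if ("time".equals(router)) {
      for (String key : CATEGORY_ONLY) {
        if (params.containsKey(key)) {
          errors.add(key + " is only valid for category routed aliases");
        }
      }
    }
    return errors;
  }
}
```

Collecting all violations rather than failing on the first one is a design choice of this sketch; it lets a single error response list every offending attribute.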
h1. Implementation
The intention here is to first convert TimeRoutedAliasUpdateProcessor to
RoutedAliasUpdateProcessor and move as much time-related functionality to the
TimeRoutedAlias class as possible. If necessary, TimeRoutedAliasUpdateProcessor
may still remain as a (hopefully skinny) subclass of
RoutedAliasUpdateProcessor. I also hope to extract a RoutedAlias interface from
TimeRoutedAlias that will be implemented by a new CategoryRoutedAlias class.
Ideally I'd like to end up with a RoutedAliasUpdateProcessor and two concrete
RoutedAlias implementations, though I'm not sure if that will really be
possible. I'll break things down and make individual tickets for sub-parts
after I play with the code a little.
Both the v1 API and the v2 API will be supported.
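The target shape (one processor, one interface, two implementations) might be sketched as follows. All type and method names here are hypothetical stand-ins, the collection naming schemes are invented for illustration, and the time-based variant is deliberately simplified to one collection per UTC day rather than modelling a real start time and interval.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for the proposed RoutedAlias interface.
interface RoutedAliasSketch {
  String targetCollection(String aliasName, Object routedValue);
}

// Routes by the raw field value; the "__" separator is an invented convention.
class CategoryRoutedAliasSketch implements RoutedAliasSketch {
  @Override
  public String targetCollection(String aliasName, Object routedValue) {
    return aliasName + "__" + routedValue;
  }
}

// Simplified time routing: one collection per UTC day (an assumption).
class TimeRoutedAliasSketch implements RoutedAliasSketch {
  private static final DateTimeFormatter DAY =
      DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

  @Override
  public String targetCollection(String aliasName, Object routedValue) {
    return aliasName + "_" + DAY.format((Instant) routedValue);
  }
}

// The processor holds no routing logic of its own; it only delegates.
class RoutedAliasUpdateProcessorSketch {
  private final String aliasName;
  private final RoutedAliasSketch alias;

  RoutedAliasUpdateProcessorSketch(String aliasName, RoutedAliasSketch alias) {
    this.aliasName = aliasName;
    this.alias = alias;
  }

  String route(Object routedValue) {
    return alias.targetCollection(aliasName, routedValue);
  }
}
```

Keeping the processor free of time-specific or category-specific logic is the point of the refactoring: a new routing strategy would only need a new implementation of the interface.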
h1. Documentation
# The TimeRoutedAliases page will be converted to a RoutedAliases page with
sections for TimeRoutedAliases and CategoryRoutedAliases
# The CreateAliasCommand documentation will be updated
# The v2 API will return documentation for the new and modified attributes
> Category Routed Aliases
> -----------------------
>
> Key: SOLR-13131
> URL: https://issues.apache.org/jira/browse/SOLR-13131
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: master (9.0)
> Reporter: Gus Heck
> Assignee: Gus Heck
> Priority: Major
>
> This ticket is to add a second type of routed alias in addition to the
> current time routed aliases. The new type of alias will allow data driven
> creation of collections based on the values of a field and automated
> organization of these collections under an alias that allows the collections
> to also be searched as a whole.
> The use case in mind at present is an IOT device type segregation, but I
> could also see this leading to the ability to direct updates to tenant
> specific hardware (in cooperation with autoscaling).
> This ticket also looks forward to (but does not include) the creation of a
> Dimensionally Routed Alias which would allow organizing time routed data also
> segregated by device
> Further design details to be added in comments.
>