[ 
https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739731#comment-16739731
 ] 

Gus Heck commented on SOLR-13131:
---------------------------------

h1. Functionality
h2. New Parameter Value

*router.name* would gain a new valid value of "category"
h2. New Params

This feature would need some safety valves to avoid runaway collection creation 
(similar in spirit to router.maxFutureMs for TRAs). To that end I suggest:
 # *router.maxCardinality* to place a limit on the total number of collections 
that can be created (maybe required?)
 # *router.mustMatch* to provide pattern matching for valid data and reject 
requests that would create an undesired collection (optional)
 # {color:#707070}*router.dictionary*{color}  might also be added to provide a 
set of acceptable values (optional) - This may or may not be implemented as 
part of this ticket.
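
To make the first two safety valves concrete, here is a minimal sketch of how they might gate collection creation. The class name CategoryGuard, its methods, and the choice to require a full match against router.mustMatch are all assumptions for illustration, not the planned implementation.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical guard illustrating router.maxCardinality and router.mustMatch. */
final class CategoryGuard {
    private final int maxCardinality;                   // value of router.maxCardinality
    private final Pattern mustMatch;                    // compiled router.mustMatch, null if unset
    private final Set<String> known = new HashSet<>();  // categories that already have a collection

    CategoryGuard(int maxCardinality, String mustMatchRegex) {
        this.maxCardinality = maxCardinality;
        this.mustMatch = mustMatchRegex == null ? null : Pattern.compile(mustMatchRegex);
    }

    /**
     * Returns true if the value would trigger creation of a new collection,
     * false if its collection already exists. Throws if the value is rejected.
     */
    boolean accept(String value) {
        // Assumption: the whole value must match the pattern, not just a substring.
        if (mustMatch != null && !mustMatch.matcher(value).matches()) {
            throw new IllegalArgumentException("value does not match router.mustMatch: " + value);
        }
        if (known.contains(value)) {
            return false; // already routed, nothing to create
        }
        if (known.size() >= maxCardinality) {
            throw new IllegalStateException("router.maxCardinality reached: " + maxCardinality);
        }
        known.add(value);
        return true;
    }
}
```

With a limit of 2 and a pattern such as "[a-z]+_sensor", a third distinct value would be rejected rather than silently creating another collection.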

With respect to router.dictionary, I could imagine there being a desire to have 
that dictionary used as a spell checker for segments of the values. One could 
break the value on _ (or something else) and make sure all the parts are 
spelled properly. One could also imagine the dictionary being applied to 
specific matching groups from router.mustMatch, but all of this dictionary 
based checking would be a future enhancement. I'm mentioning it here to get the 
idea out there for future reference.
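
As a rough illustration of the segment-wise dictionary idea (a future enhancement, not part of this ticket), the check could look something like the sketch below. The class and method names are hypothetical, and splitting on '_' is just the example separator mentioned above.

```java
import java.util.Arrays;
import java.util.Set;

/** Hypothetical sketch of a router.dictionary check applied to segments of a value. */
final class DictionaryCheck {
    /** True when every '_'-separated segment of the value appears in the dictionary. */
    static boolean allSegmentsKnown(String value, Set<String> dictionary) {
        // An empty segment (e.g. from "a__b") fails unless the dictionary contains "".
        return Arrays.stream(value.split("_")).allMatch(dictionary::contains);
    }
}
```

For a dictionary of {"smart", "meter"}, the value "smart_meter" would pass while "smart_metre" would be rejected.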
h2. Routed Field Constraints

The data in the field to be routed will need to be constrained in a couple of 
ways to make this work:
 # The routed field would need to be single valued, and encountering multiple 
values should throw an error.
 # The value in the routed field must be convertible to a valid collection 
name. This conversion will likely be done by replacing any invalid characters 
with '_' and it is the user's responsibility to ensure that the resulting names 
are unique and do not interfere with other collections in the system. Values 
that resolve to an existing collection that is not part of the alias will cause 
an error to be returned; the existing collection will remain unaffected and 
will not be added to the alias.
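
The two constraints above can be sketched as follows. This is only an illustration: the class and method names are made up, and the set of characters treated as valid (letters, digits, '_', '.', '-') is an assumption about collection naming rules rather than something decided in this ticket.

```java
import java.util.Collection;
import java.util.regex.Pattern;

/** Hypothetical helper: single-value check plus conversion to a collection-safe name. */
final class CategoryNames {
    // Assumption: anything other than letters, digits, '_', '.', '-' is invalid in a collection name.
    private static final Pattern INVALID = Pattern.compile("[^A-Za-z0-9_.\\-]");

    static String toCollectionName(Collection<?> fieldValues) {
        // Constraint 1: the routed field must be single valued.
        if (fieldValues == null || fieldValues.size() != 1) {
            throw new IllegalArgumentException("routed field must have exactly one value");
        }
        String raw = String.valueOf(fieldValues.iterator().next());
        // Constraint 2: replace every invalid character with '_'.
        return INVALID.matcher(raw).replaceAll("_");
    }
}
```

Note that "smart meter/v2" and "smart meter v2" both become "smart_meter_v2" under this scheme, which is why uniqueness of the resulting names is left to the user.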

h2. Validations

In addition to constraints on the values, the following validations will be 
enforced at the time the CategoryRoutedAlias is created
 # The *collections* attribute is not set (applies only to non-routed aliases)
 # None of the TimeRoutedAlias attributes are present
 # TimeRoutedAliases will also be modified to validate that 
*router.maxCardinality* and *router.mustMatch* are not set
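
A minimal sketch of the creation-time checks for a category routed alias, assuming the alias parameters arrive as a simple string map. The class name and error messages are hypothetical; router.start and router.interval are listed alongside router.maxFutureMs as examples of existing time routed alias parameters, and the real list would need to cover all of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical creation-time validation for a category routed alias. */
final class CategoryAliasValidation {
    // Parameters that only make sense for time routed aliases (illustrative, not exhaustive).
    private static final List<String> TIME_ONLY =
        List.of("router.start", "router.interval", "router.maxFutureMs");

    /** Returns a list of validation errors; an empty list means the parameters are acceptable. */
    static List<String> validate(Map<String, String> params) {
        List<String> errors = new ArrayList<>();
        if (params.containsKey("collections")) {
            errors.add("collections applies only to non-routed aliases");
        }
        for (String p : TIME_ONLY) {
            if (params.containsKey(p)) {
                errors.add(p + " applies only to time routed aliases");
            }
        }
        return errors;
    }
}
```

The mirror-image check on the time routed side would reject router.maxCardinality and router.mustMatch in the same way.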

h1. Implementation

The intention here is to first convert TimeRoutedAliasUpdateProcessor to 
RoutedAliasUpdateProcessor and move as much time related functionality to the 
TimeRoutedAlias class as possible. If necessary, TimeRoutedAliasUpdateProcessor 
may still remain as a (hopefully skinny) subclass of RoutedAliasUpdateProcessor. 
I also hope to extract a RoutedAlias interface from TimeRoutedAlias, which will 
then be implemented by a new CategoryRoutedAlias class. Ideally I'd like to end 
up with a single RoutedAliasUpdateProcessor and two concrete RoutedAlias 
implementations, though I'm not sure if that will really be possible. I'll break 
things down and make individual tickets for sub parts after I play with the code 
a little.
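
The target shape described above might look roughly like the following. Only the type names come from this proposal; the single targetCollection method, the per-day bucketing, and the "alias_value" naming scheme are placeholders invented for the sketch and will certainly differ in the real code.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical interface extracted from TimeRoutedAlias. */
interface RoutedAlias {
    /** Name of the collection that a document with this routed-field value belongs in. */
    String targetCollection(Object routedValue);
}

/** Time based routing; per-day buckets are an assumption made for this sketch. */
final class TimeRoutedAlias implements RoutedAlias {
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);
    private final String aliasName;

    TimeRoutedAlias(String aliasName) { this.aliasName = aliasName; }

    @Override
    public String targetCollection(Object routedValue) {
        return aliasName + "_" + DAY.format((Instant) routedValue);
    }
}

/** Category based routing; the naming scheme is a placeholder. */
final class CategoryRoutedAlias implements RoutedAlias {
    private final String aliasName;

    CategoryRoutedAlias(String aliasName) { this.aliasName = aliasName; }

    @Override
    public String targetCollection(Object routedValue) {
        return aliasName + "_" + routedValue;
    }
}

/** One processor that delegates all routing decisions to the alias implementation. */
final class RoutedAliasUpdateProcessor {
    private final RoutedAlias alias;

    RoutedAliasUpdateProcessor(RoutedAlias alias) { this.alias = alias; }

    String route(Object routedValue) {
        return alias.targetCollection(routedValue);
    }
}
```

The point of the sketch is that the processor never needs to know which kind of routing it is doing, which is what would make a separate TimeRoutedAliasUpdateProcessor subclass unnecessary.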

Both the v1 API and the v2 API will be supported.
h1. Documentation
 # The TimeRoutedAliases page will be converted to a RoutedAliases page with 
sections for TimeRoutedAliases and CategoryRoutedAliases
 # The CreateAliasCommand Documentation will be updated
 # The v2 api will return documentation for the new and modified attributes via 
that api.

 

> Category Routed Aliases
> -----------------------
>
>                 Key: SOLR-13131
>                 URL: https://issues.apache.org/jira/browse/SOLR-13131
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: master (9.0)
>            Reporter: Gus Heck
>            Assignee: Gus Heck
>            Priority: Major
>
> This ticket is to add a second type of routed alias in addition to the 
> current time routed aliases. The new type of alias will allow data driven 
> creation of collections based on the values of a field and automated 
> organization of these collections under an alias that allows the collections 
> to also be searched as a whole.
> The use case in mind at present is IoT device type segregation, but I 
> could also see this leading to the ability to direct updates to tenant 
> specific hardware (in cooperation with autoscaling). 
> This ticket also looks forward to (but does not include) the creation of a 
> Dimensionally Routed Alias which would allow organizing time routed data also 
> segregated by device.
> Further design details to be added in comments.
>  


