[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883163#comment-16883163 ]

Gus Heck commented on SOLR-13375:
---------------------------------

Patch with tests and fixes to make the tests pass. Things yet to do:
* Documentation
* Review the javadocs and comments in code many of which may have become obsolete

After that I'll look to commit this to master and 8x for inclusion in 8.3

> Dimensional Routed Aliases
> --------------------------
>
> Key: SOLR-13375
> URL: https://issues.apache.org/jira/browse/SOLR-13375
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: master (9.0)
> Reporter: Gus Heck
> Assignee: Gus Heck
> Priority: Major
> Attachments: SOLR-13375.patch, SOLR-13375.patch, SOLR-13375.patch, SOLR-13375.patch
>
> Current available routed aliases are restricted to a single field. This
> feature will allow Solr to provide data driven collection access, creation
> and management based on multiple fields in a document. The collections will
> be queried and updated in a unified manner via an alias. Current routing is
> restricted to the values of a single field. The particularly useful
> combination at this time will be Category X Time routing but Category X
> Category may also be useful. More importantly, if additional routing schemes
> are created in the future (either as contributions or as custom code by
> users) combination among these should be supported.
>
> It is expected that not all combinations will be useful, and that
> determination of usefulness I expect to leave up to the user. Some Routing
> schemes may need to be limited to be the leaf/last routing scheme for
> technical reasons, though I'm not entirely convinced of that yet. If so, a
> flag will be added to the RoutedAlias interface.
>
> Initial desire is to support two levels, though if arbitrary levels can be
> supported easily that will be done.
>
> This could also have been called CompositeRoutedAlias, but that creates a TLA
> clash with CategoryRoutedAlias.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
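To make the "Category X Time" combination from the issue description concrete, here is a minimal sketch of how a two-dimensional router might turn two document fields into one target collection name. It is an illustration only, not the patch's code: the class `DimensionalRoutingSketch`, its methods, the `__CRA__`/`__TRA__` name segments, and the fixed one-month time slice are all assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration; none of these names come from the SOLR-13375 patch.
final class DimensionalRoutingSketch {

    // First dimension: route on the raw category value (assumed naming scheme).
    static String categorySegment(String category) {
        return "__CRA__" + category;
    }

    // Second dimension: route on the start of the UTC month containing the document time.
    static String timeSegment(Instant docTime) {
        ZonedDateTime monthStart = docTime.atZone(ZoneOffset.UTC)
                .withDayOfMonth(1)
                .truncatedTo(ChronoUnit.DAYS);
        return "__TRA__" + monthStart.toLocalDate();
    }

    // Each dimension contributes one segment; the order matches the order of the dimensions.
    static String targetCollection(String alias, String category, Instant docTime) {
        return alias + categorySegment(category) + timeSegment(docTime);
    }
}
```

With these assumptions, `targetCollection("dra_test", "books", Instant.parse("2019-03-17T12:34:56Z"))` yields `dra_test__CRA__books__TRA__2019-03-01`: one collection per (category, month) pair, all reachable through the single alias.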
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880918#comment-16880918 ]

David Smiley commented on SOLR-13375:
-------------------------------------

Fascinating bug to track down; congrats on that! I hope it might help some other tests to be less flakey.
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878979#comment-16878979 ]

Gus Heck commented on SOLR-13375:
---------------------------------

First potentially functional patch. Lots of refactoring to move logic into the routed alias classes from the maintain Cmd classes, which are now consolidated into a single MaintainRoutedAliasCmd class. Also refactored many longer methods into smaller chunks.

The basic strategy here is to use a specialized subclass of the primary routed alias classes to provide the context for answering the question of whether or not collections need to be created. I also altered the basic logic such that the routed aliases fully calculate the collection to which the document should be routed and then push that "target" collection to the maintain command repeatedly until the required collection has been created. This simplifies the situation, which previously had MaintainCategoryRoutedAliasCmd working off of the value encountered in the document while MaintainTimeRoutedAlias was just marching the collections forward without knowledge of the end state. The imbalance in those strategies needed to be resolved to keep DRAs tractable.

Another notable abstraction added is a notion of "actions" that are requested by each routed alias during the execution of the MaintainRoutedAliasCmd. In the case of DRAs these are generated by each sub-dimension and collated into a final set of actions by the DRA.

Additionally I hit a very time-consuming bug with tests, where I eventually realized that the problem is that the results of an admin command become visible (to the test) before the execution of the command is entirely completed, and the test that has waited for a collection to be visible can begin to shut down while an async operation is still in progress. This can lead to the watcher.await(timeout); call in OverseerTaskQueue.offer never being released (and then the shutdown cycle that is waiting for the core async thread to terminate waits until the timeout expires). This only showed up if I saturated my CPU, and then only about 20% of the time. The sneaky thing about this is that if you beasted it and went to bed or went to lunch it would complete successfully because of the timeout, but the time it took to do so was ridiculous if you were waiting for it.

A 5 second Thread.sleep() as the last line of the test reliably resolved this, but not being happy with that I added a count of pending overseerTasks and an allowOverseerPendingTasksToComplete() method to OverseerTaskQueue, and the first thing that happens on CoreContainer.shutdown is that it calls the new method (which of course first prohibits new tasks from being queued... though I'm not sure if the exception thrown to threads that try is ideal...). Once the in-progress tasks finish, shutdown proceeds normally. This completely solved the problems with my async collection creation tests.

[^SOLR-13375.patch]
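The shutdown fix described in the comment above (count pending overseer tasks, refuse new ones, then wait for the in-flight ones) follows a general pattern that can be sketched without Solr. The class below is a standalone illustration under stated assumptions, not the code in the patch: `PendingTaskGate` and all of its method names are hypothetical, and the choice of IllegalStateException for rejected tasks is just one option (the comment itself leaves open whether throwing is ideal).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a "drain pending tasks before shutdown" gate; not Solr's OverseerTaskQueue.
final class PendingTaskGate {
    private final Object lock = new Object();
    private int pending = 0;        // tasks queued but not yet finished
    private boolean closed = false; // set once shutdown has started

    // Call before queueing a task; rejects new work after shutdown has begun.
    void taskStarted() {
        synchronized (lock) {
            if (closed) {
                throw new IllegalStateException("shutting down; no new tasks accepted");
            }
            pending++;
        }
    }

    // Call when a task has completed (successfully or not); wakes any waiting shutdown thread.
    void taskFinished() {
        synchronized (lock) {
            pending--;
            lock.notifyAll();
        }
    }

    // Prohibits new tasks, then waits up to the timeout for in-flight tasks to finish.
    // Returns true if everything drained, false on timeout or interruption.
    boolean allowPendingTasksToComplete(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (lock) {
            closed = true;
            while (pending > 0) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;
                }
                try {
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return false;
                }
            }
            return true;
        }
    }

    int pendingCount() {
        synchronized (lock) {
            return pending;
        }
    }
}
```

Compared with a fixed Thread.sleep() at the end of a test, a gate like this waits only as long as the in-flight work actually takes, and it closes the window in which a task could be queued after shutdown has started.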
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875345#comment-16875345 ]

Gus Heck commented on SOLR-13375:
---------------------------------

Another WIP patch, now solving the v2 api issue via an implementation of toMap() on the SolrParams anon wrapper. Both APIs work, and all tests pass (but the DRA they create still isn't functional; that's next).
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872310#comment-16872310 ]

Gus Heck commented on SOLR-13375:
---------------------------------

Actually, looking again this morning I realize that the writeParams method is a bit of a red herring. It's really the conflict between the getParams() method api and the objects I want to express. The real problem is that the v2 API (as I am attempting to use it) wants to be able to handle more complex objects than SolrParams really was intended for. SolrParams *is* documented as being string to one or more strings, but that makes it hard to handle json that has properties that are lists of objects (lists of strings clearly work). Autoscaling seems to be using lists of objects for set-trigger.actions, but AFAICT they don't have a v1 api and I suspect they therefore dodge this SolrParams.toMap()/getParam() issue.

One possible way around this might be to override toMap() in the wrapper to just return the map that backs the wrapper, but that still has to somehow do the conversions to v1 api keys before returning the map, and it widens the actual capabilities of SolrParams beyond its documentation, which could trip up other folks.
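The workaround floated in the comment above (have the wrapper's toMap() hand back its backing map, with keys converted to their v1 names, instead of coercing every value to String[]) might look roughly like the sketch below. This is a standalone illustration and not the code in BaseHandlerApiSupport or in the patch: `WrapperToMapSketch`, its fields, and the rename table passed to it are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a params wrapper backed by parsed v2 JSON; not Solr's SolrParams.
final class WrapperToMapSketch {
    private final Map<String, Object> backing;    // values may be Strings, Lists, or nested Maps
    private final Map<String, String> v2ToV1Keys; // assumed rename table from v2 keys to v1 keys

    WrapperToMapSketch(Map<String, Object> backing, Map<String, String> v2ToV1Keys) {
        this.backing = backing;
        this.v2ToV1Keys = v2ToV1Keys;
    }

    // Copies the backing map, renaming keys where a v1 name is known.
    // Values are passed through untouched, so a list of objects survives as a List.
    Map<String, Object> toMap() {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : backing.entrySet()) {
            String v1Key = v2ToV1Keys.getOrDefault(entry.getKey(), entry.getKey());
            out.put(v1Key, entry.getValue());
        }
        return out;
    }
}
```

The trade-off is the one the comment names: callers of toMap() would now see values that are not strings or string arrays, which goes beyond the documented "string to one or more strings" contract.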
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872301#comment-16872301 ]

Gus Heck commented on SOLR-13375:
---------------------------------

org.apache.solr.handler.admin.BaseHandlerApiSupport#wrapParams
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871982#comment-16871982 ]

David Smiley commented on SOLR-13375:
-------------------------------------

Which "a SolrParams wrapper instance" is this?
[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871936#comment-16871936 ]

Gus Heck commented on SOLR-13375:
---------------------------------

Attaching patch with WIP initial concept of the V1 api for this, working to the point of creating an alias that thinks it's a DRA (but fails with a "not yet implemented" exception message if you try to send it data).

API looks like this:
{code:java}
http://localhost:8983/solr/admin/collections?action=CREATEALIAS
=dra_test
=Dimensional[category,time]
=myCategory_s
=20
=myDate_tdt
=2019-01-01T00:00:00Z/MONTH
=%2B1MONTH
=60
=_default
=2
{code}
This is not mapped into the V2 API yet because although I want to do this:
{code:java}
"routerList": {
  "type": "array",
  "description": "A list of router property sets to be used with router type Dimensional[foo,bar] where foo and bar are valid router type names (i.e. time or category). The order must correspond to the type specification in [] in the Dimensional type, so Dimensional[category,time] would require the first set of router properties to be valid for a category routed alias, and the second set to be valid for a time routed alias. In these sets of properties, router.name will be ignored in favor of the type specified in the top level Dimensional[] router.name",
  "items": {
    "type": "object",
    "additionalProperties": true
  }
}
{code}
enabling this v2 api JSON:
{code:java}
{
  "create-alias": {
    "name": "dra_test2",
    "router": {
      "name": "Dimensional[category,time]",
      "routerList": [
        {
          "field": "myCategory_s",
          "maxCardinality": 20
        },
        {
          "field": "myDate_tdt",
          "start": "2019-01-01T00:00:00Z",
          "interval": "+1MONTH",
          "maxFutureMs": 60
        }
      ]
    },
    "create-collection": {
      "collection.configName": "_default",
      "numShards": 2
    }
  }
}
{code}
this todo/assumption from SOLR-11913 is getting in the way ([~dsmiley]):
{code:java}
public void writeMap(EntryWriter ew) throws IOException {
  //TODO don't call toNamedList; more efficiently implement here
  //note: multiple values, if present, are a String[] under 1 key
  toNamedList().forEach((k, v) -> {
{code}
And throwing:
{code:java}
"error": {
  "metadata": [
    "error-class", "org.apache.solr.common.SolrException",
    "root-error-class", "java.lang.ArrayStoreException"
  ],
  "msg": "java.lang.ArrayStoreException: arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object[] to the type of the destination array, java.lang.String",
  "code": 400
}
{code}
The reason for this is that my configuration results in a SolrParams wrapper instance that has non-string (List) values in the map variable (which is carried along by the lambda as backing for getParams(), which returns String[] and uses List.toArray() with a String array parameter)...

I may be able to work around this by not using toMap(), but that's probably going to be messier.
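The ArrayStoreException reported in this comment can be reproduced outside Solr. The sketch below is a standalone illustration, not Solr code: `ToArrayMismatchDemo` and its methods are hypothetical names, and the only assumption taken from the comment is that a getParams()-style accessor copies a backing List into a String[] with List.toArray().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reproduction of the failure mode; not Solr's SolrParams wrapper.
final class ToArrayMismatchDemo {

    // Mimics an accessor that promises String[] but is backed by an arbitrary List.
    static String[] asStringArray(List<?> values) {
        // The array store check happens at runtime, so non-String elements fail here.
        return values.toArray(new String[0]);
    }

    // Builds a value shaped like the "routerList" JSON property: a list of objects, not strings.
    static List<Object> routerListLikeValue() {
        Map<String, Object> categoryProps = new LinkedHashMap<>();
        categoryProps.put("field", "myCategory_s");
        categoryProps.put("maxCardinality", 20);
        List<Object> routerList = new ArrayList<>();
        routerList.add(categoryProps);
        return routerList;
    }

    // Returns true if converting the list to String[] throws ArrayStoreException.
    static boolean failsWithArrayStoreException(List<?> values) {
        try {
            asStringArray(values);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }
}
```

A list of plain strings converts without trouble, which matches the remark elsewhere in the thread that lists of strings clearly work; the failure appears only once a Map (or any other non-String object) sits in the list.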