[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-07-11 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883163#comment-16883163
 ] 

Gus Heck commented on SOLR-13375:
-

Patch with tests and fixes to make the tests pass. Things yet to do:
 * Documentation
 * Review the javadocs and code comments, many of which may have become 
obsolete

After that I'll look to commit this to master and 8x for inclusion in 8.3.

 

> Dimensional Routed Aliases
> --
>
> Key: SOLR-13375
> URL: https://issues.apache.org/jira/browse/SOLR-13375
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Attachments: SOLR-13375.patch, SOLR-13375.patch, SOLR-13375.patch, 
> SOLR-13375.patch
>
>
> Current available routed aliases are restricted to a single field. This 
> feature will allow Solr to provide data driven collection access, creation 
> and management based on multiple fields in a document. The collections will 
> be queried and updated in a unified manner via an alias. Current routing is 
> restricted to the values of a single field. The particularly useful 
> combination at this time will be Category X Time routing but Category X 
> Category may also be useful. More importantly, if additional routing schemes 
> are created in the future (either as contributions or as custom code by 
> users) combination among these should be supported. 
> It is expected that not all combinations will be useful, and that 
> determination of usefulness I expect to leave up to the user. Some Routing 
> schemes may need to be limited to be the leaf/last routing scheme for 
> technical reasons, though I'm not entirely convinced of that yet. If so, a 
> flag will be added to the RoutedAlias interface.
> Initial desire is to support two levels, though if arbitrary levels can be 
> supported easily that will be done.
> This could also have been called CompositeRoutedAlias, but that creates a TLA 
> clash with CategoryRoutedAlias.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-07-08 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880918#comment-16880918
 ] 

David Smiley commented on SOLR-13375:
-

Fascinating bug to track down; congrats on that! I hope it might help some 
other tests to be less flaky.




[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-07-04 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878979#comment-16878979
 ] 

Gus Heck commented on SOLR-13375:
-

First potentially functional patch. Lots of refactoring to move logic into the 
routed alias classes from the maintain Cmd classes, which are now consolidated 
into a single MaintainRoutedAlias class. Also refactored many longer methods 
into smaller chunks. The basic strategy here is to use a specialized subclass 
of the primary routed alias classes to provide the context for answering the 
question of whether or not collections need to be created. I also altered the 
basic logic such that the routed aliases fully calculate the collection to 
which the document should be routed and then push that "target" collection to 
the maintain command repeatedly until the required collection has been created. 
This simplifies the previous situation, in which 
MaintainCategoryRoutedAliasCmd worked off of the value encountered in the 
document while MaintainTimeRoutedAlias just marched the collections forward 
without knowledge of the end state. The imbalance in those strategies needed to 
be resolved to keep DRAs tractable. Another notable abstraction added is a 
notion of "actions" that are requested by each routed alias during the 
execution of the MaintainRoutedAliasCmd. In the case of DRAs, these are 
generated by each sub-dimension and collated into a final set of actions by the 
DRA. 
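
To make the "actions" idea concrete, here is a minimal sketch of how per-dimension requests might be collated into one final set. None of these names come from the patch: ActionCollationSketch, ActionType, Action, Dimension and the collection names are all hypothetical, and the real classes may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Illustrative only: every type and name here is hypothetical, not taken from the patch. */
public class ActionCollationSketch {

  public enum ActionType { ENSURE_EXISTS, ENSURE_REMOVED }

  /** One requested change to the collections behind the alias. */
  public static final class Action {
    public final ActionType type;
    public final String collection;

    public Action(ActionType type, String collection) {
      this.type = type;
      this.collection = collection;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Action)) return false;
      Action other = (Action) o;
      return type == other.type && collection.equals(other.collection);
    }

    @Override
    public int hashCode() {
      return Objects.hash(type, collection);
    }

    @Override
    public String toString() {
      return type + ":" + collection;
    }
  }

  /** Stand-in for one dimension (e.g. category or time) of a dimensional alias. */
  public interface Dimension {
    List<Action> actionsFor(String targetCollection);
  }

  /** Collates the actions of all dimensions, dropping duplicates and keeping request order. */
  public static List<Action> collate(List<Dimension> dimensions, String targetCollection) {
    Set<Action> collated = new LinkedHashSet<>();
    for (Dimension dimension : dimensions) {
      collated.addAll(dimension.actionsFor(targetCollection));
    }
    return new ArrayList<>(collated);
  }

  /** Builds two toy dimensions and collates their requests for a made-up target name. */
  public static List<Action> demo() {
    Dimension category = target ->
        Arrays.asList(new Action(ActionType.ENSURE_EXISTS, target));
    Dimension time = target -> Arrays.asList(
        new Action(ActionType.ENSURE_EXISTS, target),
        new Action(ActionType.ENSURE_REMOVED, "dra_test_old"));
    return collate(Arrays.asList(category, time), "dra_test_target");
  }

  public static void main(String[] args) {
    // Both dimensions ask for the target; the duplicate request collapses to one action.
    System.out.println(demo());
  }
}
```

The point of the sketch is only that each dimension answers for the same fully calculated target, so duplicate requests collapse naturally when they are merged.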

Additionally, I hit a very time-consuming bug with tests. I eventually 
realized that the results of an admin command become visible (to the test) 
before the execution of the command is entirely completed, so a test that has 
waited for a collection to be visible can begin to shut down while an async 
operation is still in progress. This can lead to the watcher.await(timeout); 
call in OverseerTaskQueue.offer never being released (and then the shutdown 
cycle that is waiting for the core async thread to terminate waits until the 
timeout expires). This only showed up if I saturated my CPU, and then only 
about 20% of the time. The sneaky thing about this is that if you beasted it 
and went to bed or went to lunch it would complete successfully because of the 
timeout, but the time it took to do so was ridiculous if you were waiting for 
it. 

A 5 second Thread.sleep() as the last line of the test reliably resolved this, 
but not being happy with that, I added a count of pending overseerTasks and an 
allowOverseerPendingTasksToComplete() method to OverseerTaskQueue, and the 
first thing that happens on CoreContainer.shutdown is a call to the new method 
(which of course first prohibits new tasks from being queued... though I'm not 
sure if the exception thrown to threads that try is ideal...). Once the 
in-progress tasks finish, shutdown proceeds normally. This completely solved 
the problems with my async collection creation tests.  [^SOLR-13375.patch] 
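
The general pattern described here (count in-flight tasks, refuse new ones once shutdown starts, then wait for the count to reach zero) can be sketched in plain Java as follows. This is not the code from the patch: PendingTaskGate and its method names are hypothetical, and the choice of IllegalStateException for rejected tasks is an assumption made for the example.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: a hypothetical gate, not the OverseerTaskQueue change in the patch. */
public class PendingTaskGate {
  private final Object lock = new Object();
  private int pending = 0;
  private boolean closed = false;

  /** Registers a task; rejects it once shutdown has begun. */
  public void taskStarted() {
    synchronized (lock) {
      if (closed) {
        throw new IllegalStateException("shutting down; no new tasks accepted");
      }
      pending++;
    }
  }

  /** Marks a task as done and wakes any thread waiting for the queue to drain. */
  public void taskFinished() {
    synchronized (lock) {
      pending--;
      lock.notifyAll();
    }
  }

  /** Closes the gate, then waits for in-flight tasks; returns true if they all finished. */
  public boolean allowPendingTasksToComplete(long timeoutMs) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    synchronized (lock) {
      closed = true;
      while (pending > 0) {
        long remainingNs = deadline - System.nanoTime();
        if (remainingNs <= 0) {
          return false;
        }
        TimeUnit.NANOSECONDS.timedWait(lock, remainingNs);
      }
      return true;
    }
  }

  /** Runs one slow task, drains the gate, then checks that a late task is rejected. */
  public static boolean demo() {
    PendingTaskGate gate = new PendingTaskGate();
    gate.taskStarted();
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      gate.taskFinished();
    });
    worker.start();
    try {
      boolean drained = gate.allowPendingTasksToComplete(5000);
      boolean rejected = false;
      try {
        gate.taskStarted();
      } catch (IllegalStateException e) {
        rejected = true;
      }
      return drained && rejected;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Unlike a fixed sleep, the wait ends as soon as the last in-flight task reports completion, and the timeout only bounds the worst case.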


[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-06-28 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875345#comment-16875345
 ] 

Gus Heck commented on SOLR-13375:
-

Another WIP patch, now solving the v2 API issue via an implementation of 
toMap() on the SolrParams anon wrapper. Both work, and all tests pass (but the 
DRA they create still isn't functional; that's next). 




[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-06-25 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872310#comment-16872310
 ] 

Gus Heck commented on SOLR-13375:
-

Actually, looking again this morning, I realize that the writeParams method is 
a bit of a red herring. It's really the conflict between the getParams() method 
API and the objects I want to express. The real problem is that the v2 API (as 
I am attempting to use it) wants to be able to handle more complex objects than 
SolrParams was really intended for. SolrParams *is* documented as being string 
to one or more strings, but that makes it hard to handle JSON that has 
properties that are lists of objects (lists of strings clearly work). 
Autoscaling seems to be using lists of objects for set-trigger.actions, but 
AFAICT they don't have a v1 API and I suspect they therefore dodge this 
SolrParams.toMap()/getParam() issue. One possible way around this might be to 
override toMap() in the wrapper to just return the map that backs the wrapper, 
but that still has to somehow do the conversions to v1 API keys before 
returning the map, and it widens the actual capabilities of SolrParams beyond 
its documentation, which could trip up other folks.




[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-06-25 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872301#comment-16872301
 ] 

Gus Heck commented on SOLR-13375:
-

org.apache.solr.handler.admin.BaseHandlerApiSupport#wrapParams




[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-06-24 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871982#comment-16871982
 ] 

David Smiley commented on SOLR-13375:
-

Which "a SolrParams wrapper instance" is this?




[jira] [Commented] (SOLR-13375) Dimensional Routed Aliases

2019-06-24 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871936#comment-16871936
 ] 

Gus Heck commented on SOLR-13375:
-

Attaching a patch with a WIP initial concept of the V1 API for this, working to 
the point of creating an alias that thinks it's a DRA (but fails with a "not 
yet implemented" exception message if you try to send it data). The API looks like this:
{code:java}
http://localhost:8983/solr/admin/collections?action=CREATEALIAS
  =dra_test
  =Dimensional[category,time]
  =myCategory_s
  =20
  =myDate_tdt
  =2019-01-01T00:00:00Z/MONTH
  =%2B1MONTH
  =60
  =_default
  =2
 {code}
This is not mapped into the V2 API yet because although I want to do this:
{code:java}
"routerList": {
  "type": "array",
  "description": "A list of router property sets to be used with router type Dimensional[foo,bar] where foo and bar are valid router type names (i.e. time or category). The order must correspond to the type specification in [] in the Dimensional type, so Dimensional[category,time] would require the first set of router properties to be valid for a category routed alias, and the second set to be valid for a time routed alias. In these sets of properties, router.name will be ignored in favor of the type specified in the top level Dimensional[] router.name",
  "items": {
    "type": "object",
    "additionalProperties": true
  }
}
 {code}
enabling this v2 api JSON:
{code:java}
{
  "create-alias": {
    "name": "dra_test2",
    "router": {
      "name": "Dimensional[category,time]",
      "routerList": [{
        "field": "myCategory_s",
        "maxCardinality": 20
      }, {
        "field": "myDate_tdt",
        "start": "2019-01-01T00:00:00Z",
        "interval": "+1MONTH",
        "maxFutureMs": 60
      }]
    },
    "create-collection": {
      "collection.configName": "_default",
      "numShards": 2
    }
  }
}
 {code}
this todo/assumption from SOLR-11913 is getting in the way ([~dsmiley]):
{code:java}
  public void writeMap(EntryWriter ew) throws IOException {
//TODO don't call toNamedList; more efficiently implement here
//note: multiple values, if present, are a String[] under 1 key
toNamedList().forEach((k, v) -> {
 {code}
And throwing:
{code:java}
"error": {
  "metadata": [
    "error-class",
    "org.apache.solr.common.SolrException",
    "root-error-class",
    "java.lang.ArrayStoreException"
  ],
  "msg": "java.lang.ArrayStoreException: arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object[] to the type of the destination array, java.lang.String",
  "code": 400
}
 {code}
The reason for this is that my configuration results in a SolrParams wrapper 
instance that has non-string (List) values in the map variable (which is 
carried along by the lambda as backing for getParams(), which returns String[] 
and uses List.toArray() with a String array parameter). I may be able to work 
around this by not using toMap(), but that's probably going to be messier.
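
The failure mode can be reproduced without Solr. The sketch below uses only java.util and shows that copying a list into a String[] works when every element is a string and throws ArrayStoreException once an element is a nested object. The class and method names are made up for the illustration, and the map merely stands in for one entry of a routerList-style value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: reproduces the exception type with plain collections, not SolrParams. */
public class ToArrayMismatchSketch {

  /** Returns true if copying the values into a String[] fails with ArrayStoreException. */
  public static boolean failsToCopy(List<Object> values) {
    try {
      values.toArray(new String[0]);
      return false;
    } catch (ArrayStoreException e) {
      return true;
    }
  }

  /** A multi-valued parameter as SolrParams documents it: strings only. */
  public static List<Object> stringsOnly() {
    List<Object> values = new ArrayList<>();
    values.add("category");
    values.add("time");
    return values;
  }

  /** A value list holding a nested object, standing in for one routerList entry. */
  public static List<Object> withNestedObject() {
    List<Object> values = new ArrayList<>();
    values.add(Collections.singletonMap("field", "myCategory_s"));
    return values;
  }

  public static void main(String[] args) {
    System.out.println(failsToCopy(stringsOnly()));      // prints "false"
    System.out.println(failsToCopy(withNestedObject())); // prints "true"
  }
}
```

The call compiles because the type parameter of toArray is independent of the list's element type, so the mismatch only surfaces at runtime.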

> Dimensional Routed Aliases
> --
>
> Key: SOLR-13375
> URL: https://issues.apache.org/jira/browse/SOLR-13375
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
>
> Current available routed aliases are restricted to a single field. This 
> feature will allow Solr to provide data driven collection access, creation 
> and management based on multiple fields in a document. The collections will 
> be queried and updated in a unified manner via an alias. Current routing is 
> restricted to the values of a single field. The particularly useful 
> combination at this time will be Category X Time routing but Category X 
> Category may also be useful. More importantly, if additional routing schemes 
> are created in the future (either as contributions or as custom code by 
> users) combination among these should be supported. 
> It is expected that not all combinations will be useful, and that 
> determination of usefulness I expect to leave up to the user. Some Routing 
> schemes may need to be limited to be the leaf/last routing scheme for 
> technical reasons, though I'm not entirely convinced of that yet. If so, a 
> flag will be added to the RoutedAlias interface.
> Initial desire is to