[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665204#comment-13665204 ]
Joel Bernstein commented on SOLR-4465:
--------------------------------------

Greg,

I had some feedback offline from Yonik about this ticket. He had concerns in two areas.

The first was that allowing collectors to be specified as a query parameter amounted to picking low-level components for Solr to use. He preferred that the interface change to something higher level, such as "custom ranking". This would mean providing an interface that wouldn't reference collectors directly but would instead reference a ranking configuration in solrconfig. The ranking config would include a collector and any other parameters needed to make it work. So this is mainly a cosmetic change.

The second concern was that using delegating collectors for pluggable analytics clashed with both grouping and queryResultCaching, and that more thought needed to be put into a generic pluggable analytics framework. My plan was to remove this capability and create a separate "collector" search component that would allow people to collect anything they wanted based on the resulting DocSet of a search. This wouldn't clash with grouping or queryResultCaching.

Curious to hear your thoughts on this ticket and where you'd like to see it go.

> Configurable Collectors
> -----------------------
>
>                 Key: SOLR-4465
>                 URL: https://issues.apache.org/jira/browse/SOLR-4465
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.1
>            Reporter: Joel Bernstein
>             Fix For: 4.4
>
>         Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch
>
>
> This ticket provides a patch to add pluggable collectors to Solr. This patch
> was generated and tested with Solr 4.1.
> This is how the patch functions:
> Collectors are plugged into Solr in solrconfig.xml using the new
> collectorFactory element. For example:
> <collectorFactory name="default" class="solr.CollectorFactory"/>
> <collectorFactory name="sum" class="solr.SumCollectorFactory"/>
> The elements above define two collector factories. The first one is the
> "default" collectorFactory. The class attribute points to
> org.apache.solr.handler.component.CollectorFactory, which implements logic
> that returns the default TopScoreDocCollector and TopFieldCollector.
> To create your own collectorFactory you must subclass the default
> CollectorFactory and, at a minimum, override the getCollector method to
> return your new collector.
> The parameter "cl" turns on pluggable collectors:
> cl=true
> If cl is not in the parameters, Solr will automatically use the default
> collectorFactory.
> *Pluggable Doclist Sorting With the Docs Collector*
> You can specify two types of pluggable collectors. The first type is the docs
> collector. For example:
> cl.docs=<name>
> The above param points to a named collectorFactory in solrconfig.xml, which
> is used to construct the collector. Docs collector factories must return a
> collector that extends the TopDocsCollector base class. Docs collectors are
> responsible for collecting the doclist.
> You can specify only one docs collector per query.
> You can pass parameters to the docs collector using local params syntax. For
> example:
> cl.docs={! sort=mycustomesort}mycollector
> If cl=true and a docs collector is not specified, Solr will use the default
> collectorFactory to create the docs collector.
> *Pluggable Custom Analytics With Delegating Collectors*
> You can also specify any number of custom analytic collectors with the
> "cl.analytic" parameter. Analytic collectors are designed to collect
> something other than the doclist. Typically this would be some type of
> custom analytic.
> For example:
> cl.analytic=sum
> The parameter above specifies an analytic collector named sum. Like the docs
> collectors, "sum" points to a named collectorFactory in solrconfig.xml.
> You can specify any number of analytic collectors by adding additional
> cl.analytic parameters.
> Analytic collector factories must return Collector instances that extend
> DelegatingCollector.
> A sample analytic collector is provided in the patch through the
> org.apache.solr.handler.component.SumCollectorFactory.
> This collectorFactory provides a very simple DelegatingCollector that groups
> by a field and sums a column of floats. The sum collector is not designed to
> be a fully functional sum function but to be a proof of concept for pluggable
> analytics through delegating collectors.
> You can send parameters to analytic collectors with Solr local params syntax.
> For example:
> cl.analytic={! id=1 groupby=field1 column=field2}sum
> The "id" parameter is mandatory for analytic collectors and is used to
> identify the output from the collector. In this example the "groupby" and
> "column" params tell the sum collector which field to group by and which
> field to sum.
> Analytic collectors are passed a reference to the ResponseBuilder and can
> place maps with analytic output directly into the SolrQueryResponse with the
> add() method.
> Maps that are placed in the SolrQueryResponse are automatically added to the
> outgoing response. The response will include a list named cl.analytic.<id>,
> where id is specified in the local param.
> *Distributed Search*
> The CollectorFactory also has a method called merge(). This method aggregates
> the results from each of the shards during distributed search. The "default"
> CollectorFactory implements the default merge logic for merging documents
> from each shard. If you define a different docs collector you can override
> the default merge method to merge documents in accordance with how they are
> collected at the shard level.
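The group-by-and-sum idea described for the sample sum collector, and the kind of aggregation a merge() override would need for it, can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the patch's classes or any Solr or Lucene API, and the names GroupSumSketch, Row, sumByGroup and mergeShards are hypothetical, not taken from SOLR-4465.patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupSumSketch {

    /** One collected document: a group-by value and a float column value (hypothetical shape). */
    static final class Row {
        final String group;
        final float value;

        Row(String group, float value) {
            this.group = group;
            this.value = value;
        }
    }

    /** Per-shard step: sum the float column for each distinct group-by value. */
    static Map<String, Float> sumByGroup(List<Row> rows) {
        Map<String, Float> sums = new LinkedHashMap<>();
        for (Row row : rows) {
            // Insert the value for a new group, or add it to the running sum.
            sums.merge(row.group, row.value, Float::sum);
        }
        return sums;
    }

    /** Aggregator step: combine per-shard maps by adding the sums of matching groups. */
    static Map<String, Float> mergeShards(List<Map<String, Float>> shardSums) {
        Map<String, Float> merged = new LinkedHashMap<>();
        for (Map<String, Float> shard : shardSums) {
            for (Map.Entry<String, Float> entry : shard.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), Float::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for (manu_id_s, price) pairs on two shards.
        Map<String, Float> shard1 = sumByGroup(Arrays.asList(
                new Row("apple", 1.5f), new Row("dell", 10.0f), new Row("apple", 2.5f)));
        Map<String, Float> shard2 = sumByGroup(Arrays.asList(new Row("apple", 6.0f)));
        System.out.println(mergeShards(Arrays.asList(shard1, shard2)));
    }
}
```

A real analytic collector would accumulate these sums while documents are being collected and would delegate each document onward; the sketch only shows the arithmetic that has to agree between the shard level and the merge step.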
> With analytic collectors, you'll need to override the merge method to merge
> the analytic output from the shards. An example of how this works is provided
> in the SumCollectorFactory.
> Each collectorFactory specified in the HTTP parameters will have its merge
> method applied by the Solr aggregator node.
> *Testing the Patch With Sample Data*
> 1) Apply patch to Solr 4.1
> 2) Load sample data
> 3) Send the HTTP request:
> http://localhost:8983/solr/select?q=*:*&cl=true&facet=true&facet.field=manu_id_s&cl.analytic=%7B!+id=%271%27+groupby=%27manu_id_s%27+column=%27price%27%7Dsum
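To see which local params the encoded cl.analytic value in the test URL carries, the value can be decoded with the standard java.net classes. This is a small sketch, assuming Java 10 or later for the Charset overloads; the class name ClAnalyticParamSketch and its two helper methods are hypothetical and not part of the patch.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ClAnalyticParamSketch {

    /** Decode a form-encoded query parameter value ('+' becomes a space). */
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    /** Encode a raw parameter value so it can be placed in a query string. */
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The cl.analytic value exactly as it appears in the test URL.
        String fromTicket =
                "%7B!+id=%271%27+groupby=%27manu_id_s%27+column=%27price%27%7Dsum";

        String raw = decode(fromTicket);
        System.out.println(raw);

        // URLEncoder escapes more characters than the URL in the ticket does
        // (for example '!' and '='), but the result decodes to the same value.
        System.out.println(encode(raw));
    }
}
```

The decoded value is the local params form used earlier in the description: an id, a groupby field and a column field, followed by the collectorFactory name sum.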