[
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Bernstein updated SOLR-4465:
---------------------------------
Description:
This ticket provides a patch to add pluggable collectors to Solr. This patch
was generated and tested with Solr 4.1.
This is how the patch functions:
Collectors are plugged into Solr in the solrconfig.xml using the new
collectorFactory element. For example:
<collectorFactory name="default" class="solr.CollectorFactory"/>
<collectorFactory name="sum" class="solr.SumCollectorFactory"/>
The elements above define two collector factories. The first one is the
"default" collectorFactory. The class attribute points to
org.apache.solr.handler.component.CollectorFactory, which implements logic that
returns the default TopScoreDocCollector and TopFieldCollector.
To create your own collectorFactory you must subclass the default
CollectorFactory and at a minimum override the getCollector method to return
your new collector.
You tell Solr which collectorFactory to use at query time using HTTP
parameters. All collector parameters start with the prefix "cl.". All
parameters that start with "cl." are gathered up and added to a CollectorSpec
instance.
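The gathering step can be pictured with a small plain-Java sketch. It does not use the patch's CollectorSpec class; the class name CollectorSpecSketch and the use of a plain map are assumptions made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the patch's CollectorSpec; the real class may differ.
class CollectorSpecSketch {
    static final String PREFIX = "cl.";

    // Copy every request parameter whose name starts with "cl." into a new map.
    static Map<String, String> gather(Map<String, String> requestParams) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : requestParams.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                spec.put(e.getKey(), e.getValue());
            }
        }
        return spec;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("cl", "true");
        params.put("cl.topdocs", "default");
        params.put("cl.topdocs.max", "100");
        // Only the two "cl."-prefixed entries end up in the spec.
        System.out.println(gather(params));
    }
}
```

Note that the bare "cl" switch does not carry the "cl." prefix, so in this sketch it is read separately and not copied into the spec.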
The parameter "cl" turns on pluggable collectors:
cl=true
If cl is not in the parameters, Solr will automatically use the default
collectorFactory.
You can specify two types of pluggable collectors. The first type is the
topdocs collector. For example:
cl.topdocs=<name>
The above param points to the named collectorFactory in the solrconfig.xml to
construct the collector. Topdocs collectorFactories must return collectors that
extend the TopDocsCollector base class. Topdocs collectors are responsible for
collecting the doclist.
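As a rough illustration of how the "cl" and "cl.topdocs" parameters could select a factory name, here is a plain-Java sketch. The class and method names are hypothetical, and the fallback to "default" when cl=true is given without cl.topdocs is an assumption; the description above only states the fallback for the case where cl is absent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the patch.
class TopDocsFactoryNameSketch {
    // Return the name of the collectorFactory that should build the topdocs collector.
    static String resolve(Map<String, String> params) {
        // Without cl=true, pluggable collectors are off and the default factory is used.
        if (!"true".equals(params.get("cl"))) {
            return "default";
        }
        // Assumption: with cl=true but no cl.topdocs, fall back to "default" as well.
        return params.getOrDefault("cl.topdocs", "default");
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("cl", "true");
        params.put("cl.topdocs", "mycollector");
        System.out.println(resolve(params));
    }
}
```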
You can pass parameters to the topdocs collectors by adding "cl." HTTP
parameters. By convention you can pass parameters to the topdocs collector like
this:
cl.topdocs.max=100
This parameter will be added to the collector spec because of the "cl." prefix
and passed to the collectorFactory.
You can also specify any number of delegating collectors with the
"cl.delegating" parameter:
cl.delegating=sum,ave
The parameter above specifies two delegating collectors named sum and ave. Like
the topdocs collectors these point to named collectorFactories in the
solrconfig.xml.
Delegating collector factories must return Collector instances that extend
DelegatingCollector.
Delegating collectors are designed to collect something else besides the
doclist. Typically this would be some kind of custom analytic.
A sample delegating collector is provided in the patch through the
org.apache.solr.handler.component.SumCollectorFactory.
This collectorFactory provides a very simple DelegatingCollector that groups by
a field and sums a column of floats. The sum collector is not designed to be a
fully functional sum function but to be a proof of concept for pluggable
analytics through delegating collectors.
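The group-by-and-sum idea can be shown without any Lucene or Solr classes. The sketch below is not the patch's SumCollectorFactory or its collector; GroupSumSketch and its methods are hypothetical names, and it assumes the group value and float value for each matching document have already been looked up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of grouping by a field and summing floats.
class GroupSumSketch {
    private final Map<String, Float> sums = new LinkedHashMap<>();

    // Called once per matching document with its group value and float column value.
    void collect(String groupValue, float value) {
        // Add to the running total for this group, starting from the value itself.
        sums.merge(groupValue, value, Float::sum);
    }

    // The per-group totals, in first-seen group order.
    Map<String, Float> result() {
        return sums;
    }

    public static void main(String[] args) {
        GroupSumSketch c = new GroupSumSketch();
        c.collect("a", 1.5f);
        c.collect("b", 2.0f);
        c.collect("a", 0.5f);
        System.out.println(c.result());
    }
}
```

A real delegating collector would also pass each document on to the collector it wraps, so the doclist is still collected.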
To communicate with delegating collectors you need to reference the name and
ordinal of the collector.
The ordinal refers to the collector's position in the comma-separated list.
For example:
cl.delegating=sum,ave&cl.sum.0.groupby=field1
The "cl.sum.0.groupby" parameter tells the "sum" collector at the 0 ordinal to
group by "field1".
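The naming scheme cl.<name>.<ordinal>.<key> can be sketched as follows. This is an illustration of the convention only; DelegatingParamsSketch and its methods are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing how name and ordinal address one delegating collector.
class DelegatingParamsSketch {
    // Split "sum,ave" into ["sum", "ave"]; the list position is the ordinal.
    static List<String> names(Map<String, String> params) {
        List<String> out = new ArrayList<>();
        String raw = params.get("cl.delegating");
        if (raw == null || raw.isEmpty()) {
            return out;
        }
        for (String name : raw.split(",")) {
            out.add(name.trim());
        }
        return out;
    }

    // Collect the parameters of the form cl.<name>.<ordinal>.<key> for one collector,
    // keyed by the trailing <key> part.
    static Map<String, String> paramsFor(Map<String, String> params, String name, int ordinal) {
        String prefix = "cl." + name + "." + ordinal + ".";
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("cl.delegating", "sum,ave");
        params.put("cl.sum.0.groupby", "field1");
        List<String> names = names(params);
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.get(i) + "@" + i + " -> " + paramsFor(params, names.get(i), i));
        }
    }
}
```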
Delegating collectors are passed a reference to the ResponseBuilder and can
place maps with analytic output directly into the SolrQueryResponse with the
add() method.
Maps that are placed in the SolrQueryResponse are automatically added to the
outgoing response.
The CollectorFactory also has a method called merge(). This method aggregates
the results from each of the shards during distributed search. The "default"
CollectorFactory implements the default merge logic for merging documents from
each shard. If you define a different topdocs collector you may need to change
the default merge method to merge documents in accordance with how they are
being collected at the shard level.
With delegating collectors, you'll need to override the merge method to merge
the analytic outputs from the shards. An example of how this works is provided
in the SumCollectorFactory.
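For a sum analytic, merging shard outputs amounts to adding the per-group totals together. The sketch below shows that idea in plain Java. It is not the merge() signature from the patch; ShardMergeSketch is a hypothetical name, and it assumes each shard returns a map of group value to partial sum.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of merging per-shard group sums.
class ShardMergeSketch {
    // Add up the partial sums for each group across all shard results.
    static Map<String, Float> merge(List<Map<String, Float>> shardResults) {
        Map<String, Float> merged = new TreeMap<>();
        for (Map<String, Float> shard : shardResults) {
            for (Map.Entry<String, Float> e : shard.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Float::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Float> shard1 = new HashMap<>();
        shard1.put("a", 1.0f);
        shard1.put("b", 2.0f);
        Map<String, Float> shard2 = new HashMap<>();
        shard2.put("a", 3.0f);
        List<Map<String, Float>> shards = new ArrayList<>();
        shards.add(shard1);
        shards.add(shard2);
        System.out.println(merge(shards));
    }
}
```

Sums merge cleanly because addition is associative. An analytic such as an average would need each shard to return enough state (for example, a sum and a count) for the merge to be correct.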
was:
This ticket provides a patch to add pluggable collectors to Solr. This patch
was generated and tested with Solr 4.1.
This is how the patch functions:
Collectors are plugged into Solr in the solrconfig.xml using the new
collectorFactory element. For example:
<collectorFactory name="default" class="solr.CollectorFactory"/>
<collectorFactory name="sum" class="solr.SumCollectorFactory"/>
The elements above define two collector factories. The first one is the
"default" collectorFactory. The class attribute points to
org.apache.solr.handler.component.CollectorFactory, which implements logic that
returns the default TopScoreDocCollector and TopFieldCollector.
To create your own collectorFactory you must subclass the default
CollectorFactory and at a minimum override the getCollector method to return
your new collector.
You tell Solr which collectorFactory to use at query time using http
parameters. All collector parameters start with the prefix "cl.". All
parameters that start with "cl." are gathered up and added to a CollectorSpec
instance.
The parameter "cl" turns on pluggable collectors:
cl=true
If cl is not in the parameters, Solr will automatically use the default
collectorFactory.
You can specify two types of pluggable collectors. The first type is the
topdocs collector. For example:
cl.topdocs=<name>
The above param points to the named collectorFactory in the solrconfig.xml to
construct the collector. Topdocs collectorFactories must return collectors that
extend the TopDocsCollector base class. Topdocs collectors are responsible for
collecting the doclist.
You can pass parameters to the topdocs collectors by adding "cl." http
parameters. By convention you can pass parameters to the topdocs collector like
this:
cl.topdocs.max=100
This parameter will be added to the collector spec because of the "cl." prefix
and passed to the collectorFactory.
You can also specify any number of delegating collectors with the
"cl.delegating" parameter:
cl.delegating=sum,ave
The parameter above specifies two delegating collectors named sum and ave. Like
the topdocs collectors these point to named collectorFactories in the
solrconfig.xml.
Delegating collector factories must extend DelegatingCollector.
Delegating collectors are designed to collect something else besides the
doclist. Typically this would be some kind of custom analytic.
A sample delegating collector is provided in the patch through the
org.apache.solr.handler.component.SumCollectorFactory.
This collectorFactory provides a very simple DelegatingCollector that groups by
a field and sums a column of floats. The sum collector is not designed to be a
fully functional sum function but to be a proof of concept for pluggable
analytics through delegating collectors.
To communicate with delegating collectors you need to reference the name and
ordinal of the collector.
The ordinal refers to the collector's position in the comma-separated list.
For example:
cl.delegating=sum,ave&cl.sum.0.groupby=field1
The "cl.sum.0.groupby" parameter tells the "sum" collector at the 0 ordinal to
group by "field1".
Delegating collectors are passed a reference to the ResponseBuilder and can
place maps with analytic output directly into the SolrQueryResponse with the
add() method.
Maps that are placed in the SolrQueryResponse are automatically added to the
outgoing response.
The CollectorFactory also has a method called merge(). This method aggregates
the results from each of the shards during distributed search. The "default"
CollectorFactory implements the default merge logic for merging documents from
each shard. If you define a different topdocs collector you may need to change
the default merge method to merge documents in accordance with how they are
being collected at the shard level.
With delegating collectors, you'll need to override the merge method to merge
the analytic outputs from the shards. An example of how this works is provided
in the SumCollectorFactory.
> Configurable Collectors
> -----------------------
>
> Key: SOLR-4465
> URL: https://issues.apache.org/jira/browse/SOLR-4465
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 4.1
> Reporter: Joel Bernstein
> Fix For: 4.3
>
> Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
> SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch
>