[ https://issues.apache.org/jira/browse/SOLR-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187574#comment-13187574 ]
Jan Høydahl commented on SOLR-2827:
-----------------------------------
Example usage:
{code}
<processor class="org.apache.solr.update.processor.RegexpBoostProcessorFactory">
  <bool name="enabled">true</bool>
  <str name="inputField">url</str>
  <str name="boostField">urlboost</str>
  <str name="boostFilename">${solr.solr.home}/conf/rank/urlboosts.txt</str>
</processor>
{code}
Sample urlboosts.txt file:
{noformat}
# Sample config file for RegexBoostProcessor
# This example applies boost on the "url" field to boost or deboost certain urls
# All rules are evaluated, and if several of them match, the boosts are multiplied.
# If for example one rule with boost 2.0 and one rule with boost 0.1 match, the resulting urlboost=0.2
https?://[^/]+/old/.* 0.1 #Comments are removed
https?://[^/]+/.*index\([0-9]\).html$ 0.5
# Prioritize certain sites over others
https?://www.mydomain.no/.* 1.5
{noformat}
The output boost field can then be used at query time to tune relevance.
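
To illustrate the rule semantics described in the sample file (every rule is evaluated, and the boosts of all matching rules are multiplied), here is a minimal standalone Java sketch. It is not the code from the attached patch: the class name RegexBoostSketch, its methods, and the parsing details (everything from the first '#' is treated as a comment, the pattern and boost are separated by whitespace, the pattern must match the whole input, and the boost defaults to 1.0 when nothing matches) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the processor from SOLR-2827.patch.
class RegexBoostSketch {

    /** One rule: a compiled pattern and the boost applied when it matches. */
    static final class Rule {
        final Pattern pattern;
        final float boost;

        Rule(Pattern pattern, float boost) {
            this.pattern = pattern;
            this.boost = boost;
        }
    }

    /** Parses lines of the form "<regex> <boost> #optional comment". */
    static List<Rule> parseRules(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String raw : lines) {
            // Assumption: '#' always starts a comment, so a pattern cannot contain '#'.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue; // blank line or comment-only line
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                continue; // skip malformed lines in this sketch
            }
            rules.add(new Rule(Pattern.compile(parts[0]), Float.parseFloat(parts[1])));
        }
        return rules;
    }

    /** Multiplies the boosts of all rules whose pattern matches the whole value. */
    static float boostFor(String value, List<Rule> rules) {
        float boost = 1.0f; // assumed neutral boost when no rule matches
        for (Rule rule : rules) {
            if (rule.pattern.matcher(value).matches()) {
                boost *= rule.boost;
            }
        }
        return boost;
    }

    public static void main(String[] args) {
        List<Rule> rules = parseRules(Arrays.asList(
            "# Sample config file",
            "https?://[^/]+/old/.* 0.1 #Comments are removed",
            "https?://[^/]+/.*index\\([0-9]\\).html$ 0.5",
            "https?://www.mydomain.no/.* 1.5"));
        // This URL matches the first and third rules, so both boosts are multiplied.
        System.out.println(boostFor("http://www.mydomain.no/old/page.html", rules));
    }
}
```

With the sample rules, a URL under www.mydomain.no/old/ matches both the 0.1 and the 1.5 rule, so the two boosts are combined by multiplication rather than the first match winning.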
> RegexpBoost Update Processor
> ----------------------------
>
> Key: SOLR-2827
> URL: https://issues.apache.org/jira/browse/SOLR-2827
> Project: Solr
> Issue Type: New Feature
> Components: update
> Reporter: Jan Høydahl
> Labels: UpdateProcessor
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2827.patch
>
>
> Processor which reads a string field and outputs a float field with a boost
> value if the input string matches one of several regular expressions.
> The processor reads a separate file with one regular expression per line,
> each with an associated boost value.
> We used it to (de)boost web pages based on URL patterns. It could be used for
> many other use cases as well.
> Kindly donated by Oslo University.