[
https://issues.apache.org/jira/browse/SOLR-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alon Lanyado updated SOLR-5038:
-------------------------------
Component/s: (was: MoreLikeThis)
> Diversity Search Result In Rank
> -------------------------------
>
> Key: SOLR-5038
> URL: https://issues.apache.org/jira/browse/SOLR-5038
> Project: Solr
> Issue Type: New Feature
> Components: SearchComponents - other
> Environment: irrelevant
> Reporter: Alon Lanyado
> Labels: features
> Original Estimate: 120h
> Remaining Estimate: 120h
>
> We would like to add a diversity SearchComponent/RequestHandler to Solr.
> We will implement MMR (Maximal Marginal Relevance), which is one of the
> simplest algorithms for this problem; we will improve it in the next version.
> The idea is that search results often contain many similar documents
> (duplicates and near-duplicates that you must index), and the ranking shows
> all of those documents one after another. This is a very common problem for
> organizations.
> We need to fetch a larger list of documents from the searcher (the size is a
> parameter that needs to be chosen based on system performance) and run the
> MMR calculation on their scores:
> lambda * OldRank + (1 - lambda) * min_similarity{similarity of the current
> document to the subset of documents already chosen for the search results}
> lambda is a parameter between 0 and 1 that controls the strength of the
> diversity.
> min_similarity is calculated with Lucene's default similarity (TF-IDF)
> against the subset of already chosen documents.
> The new score will represent a combination of the relevance score and the
> diversity relative to the other documents.
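
The re-ranking step described above can be sketched roughly as follows. This is
only an illustration, not the proposed Solr component: the function name
mmr_rerank, its parameters, and the similarity callback are hypothetical, and
the relevance scores and similarities are assumed to be on comparable scales.
Note also that the commonly cited form of MMR subtracts the maximum similarity
to the already chosen documents, which is what this sketch does; the formula in
the ticket is written differently (min_similarity, added rather than
subtracted).

```python
from typing import Callable, List, Sequence


def mmr_rerank(
    scores: Sequence[float],
    similarity: Callable[[int, int], float],
    lam: float = 0.7,
    top_k: int = 10,
) -> List[int]:
    """Greedy MMR: return the indices of the chosen candidates, in pick order.

    scores[i] is the original relevance score of candidate i (hypothetical
    input; in Solr this would come from the larger list the searcher returned).
    similarity(i, j) is any document-to-document similarity, e.g. TF-IDF cosine.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")

    remaining = list(range(len(scores)))
    selected: List[int] = []

    while remaining and len(selected) < top_k:

        def mmr_score(i: int) -> float:
            # Redundancy: highest similarity to anything already chosen
            # (0.0 while nothing has been chosen yet).
            redundancy = max((similarity(i, j) for j in selected), default=0.0)
            return lam * scores[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Documents 0 and 1 are near-duplicates; document 2 is different but less relevant.
sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
order = mmr_rerank([0.9, 0.85, 0.5], lambda i, j: sim[i][j], lam=0.5, top_k=3)
print(order)
```

With lam=1.0 the original relevance order is kept unchanged; lowering lam moves
near-duplicates further down the list.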