[ https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kaleem Ahmed closed SOLR-2953.
------------------------------

    Resolution: Not A Problem

Closing, as 4.0 already implements this feature through the similarity
package classes.

> Introducing hit Count as an alternative to score
> -------------------------------------------------
>
>                 Key: SOLR-2953
>                 URL: https://issues.apache.org/jira/browse/SOLR-2953
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>   Affects Versions: 4.0
>            Reporter: Kaleem Ahmed
>              Labels: features
>             Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have score as the relevancy factor for a query against a
> document, and this score is relative to the number of documents in the
> index. In the same way, why not have another relevancy feature, say
> "hitCount", which is absolute for a given document and a given query? It
> should not depend on the number of documents in the index. This will help a
> lot for frequently changing indexes, where the search rules are predefined
> along with the relevancy factor a document needs in order to qualify for
> that query (search rule).
>
> Ex: consider a use case where a list of queries is formed with a threshold
> number for each query, and these are searched on a frequently updated index
> to get the documents that score above the threshold, i.e. when a document's
> relevancy factor crosses the threshold for a query, the document is said to
> be qualified for that query.
>
> To satisfy the above use case, the score shouldn't change every time the
> index gets updated with new documents. So we introduce a new feature called
> "hitCount", which represents the relevancy of a document against a query
> and is absolute (it won't change with index size).
>
> This hitCount is a positive integer and is calculated as follows.
>
> Ex: Document with text "the quick fox jumped over the lazy dog, while the
> lazy dog was too lazy to care"
>
> 1. For the query "lazy AND dog" the hitCount will be == (no. of occurrences
>    of "lazy" in the document) + (no. of occurrences of "dog" in the
>    document) => 3+2 => 5
> 2. For the phrase query \"lazy dog\" the hitCount will be == (no. of
>    occurrences of the exact phrase "lazy dog" in the document) => 2
>
> This will be very useful as an alternative scoring mechanism.
>
> I have already implemented this whole thing in the Solr source code (that I
> downloaded) and we are using it. So far it's going well.
>
> It would be really great if this feature were added to trunk (original
> Solr), so that we don't have to reimplement the changes every time a new
> version is released, and so that others could benefit from it as well.
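
The hitCount arithmetic in the quoted description can be sketched in a few
lines. This is only an illustration of the counting rule as stated in the
issue, not the reporter's patch and not how the Lucene/Solr similarity classes
work. The function names, the lowercase/alphanumeric tokenizer, and the choice
to return 0 when an AND term is missing are all assumptions made for this
example.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def term_hit_count(doc_tokens, terms):
    # "a AND b": sum of the occurrences of each term in the document.
    # Assumption: if any term is absent, the document does not match, so return 0.
    counts = [doc_tokens.count(term) for term in terms]
    return sum(counts) if all(counts) else 0


def phrase_hit_count(doc_tokens, phrase_tokens):
    # Phrase query: number of positions where the exact token sequence occurs.
    n = len(phrase_tokens)
    if n == 0:
        return 0
    return sum(
        1
        for i in range(len(doc_tokens) - n + 1)
        if doc_tokens[i:i + n] == phrase_tokens
    )


doc = tokenize(
    "the quick fox jumped over the lazy dog, while the lazy dog was too lazy to care"
)
and_hits = term_hit_count(doc, ["lazy", "dog"])
phrase_hits = phrase_hit_count(doc, tokenize("lazy dog"))
print(and_hits, phrase_hits)  # prints "5 2"
```

Because both counts depend only on the document text and the query, they stay
the same when other documents are added to or removed from the index, which is
the property the issue asks for.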