[ 
https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211798#comment-13211798
 ] 

Kaleem Ahmed commented on SOLR-2953:
------------------------------------

It looks like the present trunk (4.0) already lets us implement our own scoring 
through a plugin, by overriding the DefaultSimilarityProvider class in the 
similarity package. So I guess this change does not need to go in as a 
patch.

The changes I made were against version 3.5 and won't be compatible 
with the present trunk, so I am closing this issue.
                
> Introducing hit Count as an alternative to score 
> -------------------------------------------------
>
>                 Key: SOLR-2953
>                 URL: https://issues.apache.org/jira/browse/SOLR-2953
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Kaleem Ahmed
>              Labels: features
>             Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have the score as the relevancy factor for a query against a 
> document, and this score is relative to the number of documents in the index. 
> In the same way, why not have another relevancy measure, say "hitCount", 
> which is absolute for a given document and a given query? It should not 
> depend on the number of documents in the index. This would help a lot with 
> frequently changing indexes, where the search rules are predefined along with 
> the relevancy threshold a document must reach to qualify for that query 
> (search rule). 
> Ex: consider a use case where a list of queries is built with a threshold 
> number for each query, and these are run against a frequently updated index 
> to get the documents that score above the threshold, i.e. when a document's 
> relevancy factor crosses the threshold for a query, the document is said to 
> qualify for that query. 
> To satisfy this use case, the score must not change every time the 
> index is updated with new documents. So we introduce a new feature called 
> "hitCount", which represents the relevancy of a document against a query and 
> is absolute (it does not change with index size). 
> The hitCount is a positive integer and is calculated as follows. 
> Ex: a document with the text "the quick fox jumped over the lazy dog, while 
> the lazy dog was too lazy to care" 
> 1. For the query "lazy AND dog", the hitCount is (number of occurrences of 
> "lazy" in the document) + (number of occurrences of "dog" in the document) 
> => 3 + 2 => 5 
> 2. For the phrase query "lazy dog", the hitCount is (number of occurrences 
> of the exact phrase "lazy dog" in the document) => 2 
> This would be very useful as an alternative scoring mechanism. 
> I have already implemented all of this in the Solr source code (which I 
> downloaded), and we are using it. So far it is working well. 
> It would be really great if this feature were added to trunk (the original 
> Solr), so that we don't have to reapply the changes every time a new version 
> is released, and so that others can benefit from it as well. 
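
As a rough illustration of the hitCount calculation described in the issue, 
here is a minimal, standalone Java sketch. It is not the patch from SOLR-2953 
and does not use any Solr or Lucene API. The class and method names 
(HitCountSketch, termHitCount, phraseHitCount) are hypothetical, the tokenizer 
is a simple lowercase-and-split stand-in for a real analyzer, and returning 0 
when an AND term is missing is an assumption the description does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class HitCountSketch {

    // Lowercase the text and split on non-alphanumeric characters.
    // This is only a stand-in for a real Solr analyzer chain.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // hitCount for "t1 AND t2 AND ...": the sum of each term's occurrences.
    // Assumption: if any term is absent the document does not match, so 0.
    static int termHitCount(String doc, String... terms) {
        List<String> tokens = tokenize(doc);
        int total = 0;
        for (String term : terms) {
            String needle = term.toLowerCase(Locale.ROOT);
            int count = 0;
            for (String tok : tokens) {
                if (tok.equals(needle)) {
                    count++;
                }
            }
            if (count == 0) {
                return 0;
            }
            total += count;
        }
        return total;
    }

    // hitCount for a phrase query: occurrences of the exact token sequence.
    static int phraseHitCount(String doc, String phrase) {
        List<String> tokens = tokenize(doc);
        List<String> wanted = tokenize(phrase);
        if (wanted.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + wanted.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + wanted.size()).equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String doc = "the quick fox jumped over the lazy dog, while the "
                   + "lazy dog was too lazy to care";
        System.out.println(termHitCount(doc, "lazy", "dog")); // 3 + 2 = 5
        System.out.println(phraseHitCount(doc, "lazy dog"));  // 2
    }
}
```

Because both counts are computed from the document text alone, they stay the 
same when other documents are added to or removed from the index, which is the 
property the threshold use case relies on.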
