Scoring

Chris Sibert Sun, 01 Sep 2002 23:47:10 -0700

I am dissatisfied with the document scores that I'm getting. If a document is short, 
and has one occurrence of the search term, it is ranked higher than a longer document 
with two occurrences of the term. This makes little sense to me, and I'd like the 
longer document with more occurrences to be ranked higher. I figured I have to 
override the scoring method, but I can't find where Lucene actually does the scoring. 
This is actually not an uncommon problem for me, as I find perusing the API to be high 
on the confusing scale, due to the lack of comprehensive Javadoc documentation. 
(Something that even Sun doesn't spend much time on.) I attempt to read the code, but 
variable names are terse, and there's a dearth of commenting, which makes it fairly 
unfathomable.
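
To make the ranking I'm describing concrete, here is a minimal sketch of the arithmetic. It assumes the score behaves roughly like sqrt(term frequency) scaled by 1 / sqrt(document length); that formula is my assumption about a length-normalized scorer, not something I have confirmed in the Lucene source. The class name, method name, and document sizes are all made up for illustration.

```java
public class LengthNormSketch {
    // Hypothetical scoring rule, assumed for illustration only:
    // sqrt(term frequency) scaled by 1 / sqrt(document length in terms).
    static double score(int termFreq, int docLength) {
        return Math.sqrt(termFreq) / Math.sqrt(docLength);
    }

    public static void main(String[] args) {
        double shortDoc = score(1, 10);   // 10-term document, 1 occurrence
        double longDoc  = score(2, 100);  // 100-term document, 2 occurrences
        System.out.println("short: " + shortDoc);
        System.out.println("long:  " + longDoc);
        System.out.println("short ranks first: " + (shortDoc > longDoc));
    }
}
```

Under that assumption the extra occurrence in the longer document is outweighed by the length penalty, which matches the ordering I'm seeing.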


This is the code that I'm using. Am I doing the right thing in using the Query object, 
or should I be using a different one, such as TermQuery? Does TermQuery score 
differently, so that I might be happier with its behavior? If not, where might I 
find the method that actually computes the Document's score, so that I may modify it? 


    Hits     find ( String  string_searchString, String string_indexPath )
    {
        Searcher            indexSearcher    ;
        Analyzer            analyzer    ;
        Query                query       ;
        QueryParser      queryParser ;
        Hits                   searchResults_Hits        ;

        try
        {
            indexSearcher         = new   IndexSearcher ( string_indexPath ) ;
            analyzer                 = new   SimpleAnalyzer ()       ;

            query                      = QueryParser.parse ( string_searchString, "DocumentText", analyzer )    ;
            searchResults_Hits  = indexSearcher.search ( query )     ;
            
            return  searchResults_Hits ;
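        }
        catch ( Exception exception )
        {
            // Error handling is not shown in this post; returning null is a placeholder.
            return  null ;
        }
    }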
