Implement a state-of-the-art retrieval function in Lucene
---------------------------------------------------------

                 Key: LUCENE-965
                 URL: https://issues.apache.org/jira/browse/LUCENE-965
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
    Affects Versions: 2.2
            Reporter: Hui Fang


We implemented the axiomatic retrieval function, a state-of-the-art retrieval 
function, as a replacement for the default similarity function in Lucene. We 
compared the two functions and reported the results at 
http://sifaka.cs.uiuc.edu/hfang/lucene/Lucene_exp.pdf. 
The report shows that the axiomatic retrieval function performs much better 
than the default function: it finds more relevant documents, and more of the 
top-ranked documents shown to users are relevant. Incorporating such a 
state-of-the-art retrieval function could improve search quality for all 
applications built on Lucene. 

Most of the implementation changes are in AXSimilarity, TermScorer and 
TermQuery.java. However, many existing test cases are hand-coded to check that 
the default function is implemented correctly, so I also modified many test 
files so that the new retrieval function passes them. In doing so, we found 
that some old test cases are not reasonable. For example, in testQueries02 of 
TestBoolean2.java, the query is "+w3 xx" and there are two documents, 
"w1 xx w2 yy w3" and "w1 w3 xx w2 yy w3". 
The second document should be more relevant than the first because it contains 
more occurrences of the query term "w3". The original test case, however, 
requires the first document to be ranked higher than the second, which is not 
reasonable. 
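
To make the term-count argument concrete, here is a minimal sketch that counts 
how often each query term occurs in the two example documents. It uses plain 
whitespace tokenization and does not touch the Lucene API or the patch; the 
class name TermFrequencyExample and the helper termFrequency are hypothetical 
names introduced only for this illustration. 

```java
public class TermFrequencyExample {

    // Hypothetical helper: counts exact matches of a term in a
    // whitespace-separated document. This is not Lucene's analysis chain.
    static int termFrequency(String document, String term) {
        int count = 0;
        for (String token : document.trim().split("\\s+")) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The two documents and query terms from the testQueries02 example.
        String doc1 = "w1 xx w2 yy w3";
        String doc2 = "w1 w3 xx w2 yy w3";
        for (String term : new String[] {"w3", "xx"}) {
            System.out.println(term + ": doc1=" + termFrequency(doc1, term)
                    + ", doc2=" + termFrequency(doc2, term));
        }
    }
}
```

Both documents contain "xx" once, but the second contains the required term 
"w3" twice while the first contains it once. The counts alone do not determine 
a ranking, since a real scoring function also weighs factors such as document 
length. 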



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

