[ https://issues.apache.org/jira/browse/LUCENE-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780607#action_12780607 ]
Grant Ingersoll commented on LUCENE-965:
----------------------------------------
Hi Hui,
I see you updated your paper on this. Have you looked at how this might be
implemented given the flexible indexing work under way?
> Implement a state-of-the-art retrieval function in Lucene
> ---------------------------------------------------------
>
> Key: LUCENE-965
> URL: https://issues.apache.org/jira/browse/LUCENE-965
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.2
> Reporter: Hui Fang
> Fix For: 3.1
>
> Attachments: axiomaticFunction.patch
>
>
> We implemented the axiomatic retrieval function, a state-of-the-art
> retrieval function, to replace the default similarity function in Lucene.
> We compared the performance of the two functions and reported the results at
> http://sifaka.cs.uiuc.edu/hfang/lucene/Lucene_exp.pdf.
> The report shows that the axiomatic retrieval function performs much better
> than the default function: it finds more relevant documents, and users see
> more relevant documents among the top-ranked results. Incorporating such a
> state-of-the-art retrieval function could improve the search performance of
> all applications built on Lucene.
> Most of the implementation changes are in AXSimilarity, TermScorer and
> TermQuery.java. However, many existing test cases are hand-coded to check
> that the default function is implemented correctly, so I also modified many
> test files to make the new retrieval function pass them. In fact, we found
> that some of the old test cases are not reasonable. For example, in
> testQueries02 of TestBoolean2.java, the query is "+w3 xx", and we have two
> documents: "w1 xx w2 yy w3" and "w1 w3 xx w2 yy w3".
> The second document should be more relevant than the first, because it has
> more occurrences of the query term "w3". But the original test case requires
> us to rank the first document higher than the second, which is not reasonable.
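
To make the testQueries02 example above concrete, here is a minimal sketch
that counts raw occurrences of each query term in the two documents. It
assumes plain whitespace tokenization, and the class and method names
(TermFrequencySketch, termFrequency) are hypothetical, not part of Lucene or
of the attached patch. It illustrates only the term counts behind the
argument, not how either similarity function actually scores the documents.

```java
import java.util.Arrays;

// Illustrative only: counts raw term occurrences in whitespace-tokenized text.
// TermFrequencySketch and termFrequency are hypothetical names, not Lucene APIs.
class TermFrequencySketch {

    // Returns how many whitespace-separated tokens in doc equal term exactly.
    static long termFrequency(String doc, String term) {
        return Arrays.stream(doc.trim().split("\\s+"))
                     .filter(term::equals)
                     .count();
    }

    public static void main(String[] args) {
        String doc1 = "w1 xx w2 yy w3";
        String doc2 = "w1 w3 xx w2 yy w3";
        // Prints "w3: doc1=1, doc2=2" and then "xx: doc1=1, doc2=1".
        for (String term : new String[] {"w3", "xx"}) {
            System.out.println(term + ": doc1=" + termFrequency(doc1, term)
                    + ", doc2=" + termFrequency(doc2, term));
        }
    }
}
```

Both documents contain "xx" once, so the only difference in raw counts is the
extra occurrence of the required term "w3" in the second document.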