Hello Hui:
Thank you for contributing your axiomatic retrieval function to Lucene.
I cannot wait to take it for a test drive :)
Would you please report the settings for the experiment that produced this result:

Collection  Function        MAP    [EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]  NumRR
ROBUST04    Lucene Default  0.048  0.12               0.09               0.08               0.05               21

I ask because it differs considerably from mine:
num_q 249
num_ret 239436
num_rel 17412
num_rel_ret 9780
map 0.2076
gm_ap 0.1049
R-prec 0.2551
bpref 0.2189
recip_rank 0.5684
ircl_prn.0.00 0.6288
ircl_prn.0.10 0.4459
ircl_prn.0.20 0.3562
ircl_prn.0.30 0.2864
ircl_prn.0.40 0.2289
ircl_prn.0.50 0.1925
ircl_prn.0.60 0.145
ircl_prn.0.70 0.1062
ircl_prn.0.80 0.0702
ircl_prn.0.90 0.0461
ircl_prn.1.00 0.0261
P5 0.3944
P10 0.3598
P15 0.3307
P20 0.307
P30 0.2657
P100 0.1618
P200 0.1117
P500 0.0635
P1000 0.0393
Before I go further, let us make sure we are on the same page.
Here are my settings:
Data: TREC Disk 4 & 5; 528,155 documents; 1,904 MB of text.
Query Number: TREC Query Number 301-700
Query Field: <title> only
IR Engine: Lucene 2.0 (I need to double-check, but it is close :)
Note: default Lucene similarity function, using title words only.
If we are on the same page, then a MAP of 0.048 is terribly low for
queries 301-700, compared with 0.2076 in my run.
Still, your axiomatic retrieval function outperformed the default in many
other respects. So if you would like to check your experimental settings,
and if my result turns out to be closer to the real default, then we might
discover a further improvement with the axiomatic retrieval function.
That is my hope.
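For reference when comparing the two runs, here is a minimal sketch of how MAP and precision at a cutoff (the map and P5 ... P1000 lines above) are commonly computed. It is not trec_eval's code. The function names and the dictionary layout are hypothetical, and it assumes that the average is taken over every query with at least one judged-relevant document, counting a query with no retrieved documents as 0. Whether both experiments averaged over the same set of queries is one thing worth checking.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k ranks holding a relevant document.

    Divides by k even when fewer than k documents were retrieved.
    """
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Sum of precision at each rank where a relevant document appears,
    divided by the total number of relevant documents for the query."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of average precision over queries that have relevant documents.

    Assumption: a query missing from `runs` contributes 0 to the mean.
    """
    queries = [q for q, rel in qrels.items() if rel]
    if not queries:
        return 0.0
    total = sum(average_precision(runs.get(q, []), qrels[q]) for q in queries)
    return total / len(queries)
```

With this definition, a run that retrieves few of the relevant documents is penalized twice: the missing documents add nothing to the sum, while the divisor stays at the full number of relevant documents.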
Charlie Zhao
Implement a state-of-the-art retrieval function in Lucene
---------------------------------------------------------
Key: LUCENE-965
URL: https://issues.apache.org/jira/browse/LUCENE-965
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Affects Versions: 2.2
Reporter: Hui Fang
Attachments: axiomaticFunction.patch
We implemented the axiomatic retrieval function, which is a state-of-the-art retrieval function, to replace the default similarity function in Lucene. We compared the performance of these two functions and reported the results at http://sifaka.cs.uiuc.edu/hfang/lucene/Lucene_exp.pdf.
The report shows that the axiomatic retrieval function performs much better than the default function: it finds more relevant documents, and users see more relevant documents among the top-ranked results. Incorporating such a state-of-the-art retrieval function could improve the search performance of all applications built upon Lucene.