Robert,

Thank you for this great information. Let me look into these suggestions.
Ivan

--- On Wed, 1/27/10, Robert Muir <rcm...@gmail.com> wrote:

> From: Robert Muir <rcm...@gmail.com>
> Subject: Re: Average Precision - TREC-3
> To: java-user@lucene.apache.org
> Date: Wednesday, January 27, 2010, 2:52 PM
>
> Hi Ivan, it sounds to me like you are going about it the right way. I too have complained about different document/topic formats before, at least with non-TREC test collections that claim to be in TREC format.
>
> Here is a description of what I do, for what it's worth.
>
> 1. If you use the trunk benchmark code, it will now parse Descriptions and Narratives in addition to Titles, so you can run TD and TDN queries. While I think Title-only (T) queries are generally the only interesting measure, since users typically type only a few short words into a search, the TD and TDN queries are sometimes useful for comparisons. To do this you will have to either change SimpleQQParser or write your own parser that simply creates a BooleanQuery of Title + Description + Narrative (or whatever combination you want).
>
> 2. Another thing I usually test is query expansion with MoreLikeThis, all defaults, from the top 5 returned docs. I do this with T, TD, and TDN, for 6 different MAP measures. You can see a recent example where I applied all 6 measures here: https://issues.apache.org/jira/browse/LUCENE-2234 . I feel these 6 measures give me a better overall idea of any relative relevance improvement: in that example the unexpanded T run is improved 75%, but the other 5 see only a 40-50% improvement. While unexpanded T is theoretically the most realistic measure to me, I feel it's a bit fragile and sensitive, and that is a good example of it.
>
> <I can contribute code to make it easier to do the above two things if you think it would be useful, I just haven't gotten around to it>
>
> 3. I don't even bother with the 'summary output' that the Lucene benchmark package prints out; instead I simply use the benchmark package to run the queries and generate the trec_top_file (submission.txt), which I hand to trec_eval.
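To make point 1 concrete, a custom parser along these lines might work. This is a minimal sketch assuming the Lucene 3.0-era benchmark and queryParser APIs; the TDNQueryParser class name is made up, and the topic parts are assumed to be stored by TrecTopicsReader under the keys "title", "description", and "narrative":

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.benchmark.quality.QualityQuery;
    import org.apache.lucene.benchmark.quality.QualityQueryParser;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.Version;

    /** Hypothetical TDN parser: ORs title, description and narrative into one BooleanQuery. */
    public class TDNQueryParser implements QualityQueryParser {
      private final String indexField;

      public TDNQueryParser(String indexField) {
        this.indexField = indexField;
      }

      public Query parse(QualityQuery qq) throws ParseException {
        QueryParser qp = new QueryParser(Version.LUCENE_30, indexField,
            new StandardAnalyzer(Version.LUCENE_30));   // adjust Version to your release
        BooleanQuery bq = new BooleanQuery();
        // Topic parts as (assumed) stored by TrecTopicsReader; drop "narrative" for TD runs.
        for (String part : new String[] { "title", "description", "narrative" }) {
          String text = qq.getValue(part);
          if (text != null && text.length() > 0) {
            bq.add(qp.parse(QueryParser.escape(text)), BooleanClause.Occur.SHOULD);
          }
        }
        return bq;
      }
    }

Dropping entries from the array gives the T and TD variants without touching the rest of the benchmark setup.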
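Point 2's blind feedback could look roughly like the sketch below. MoreLikeThis ships in contrib/queries; the Top5Expander name and the single-field setup are assumptions, and the expanded query is simply the original ORed with a MoreLikeThis query per top document:

    import java.io.IOException;

    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.similar.MoreLikeThis;

    /** Sketch of blind feedback: expand a query from its top 5 hits with MoreLikeThis defaults. */
    public class Top5Expander {

      public static Query expand(IndexSearcher searcher, Query initial, String field)
          throws IOException {
        ScoreDoc[] top = searcher.search(initial, 5).scoreDocs;

        MoreLikeThis mlt = new MoreLikeThis(searcher.getIndexReader()); // all defaults, as above
        mlt.setFieldNames(new String[] { field });

        BooleanQuery expanded = new BooleanQuery();
        expanded.add(initial, BooleanClause.Occur.SHOULD);
        for (ScoreDoc sd : top) {
          // like(docId) builds a query from the "interesting" terms of that document
          expanded.add(mlt.like(sd.doc), BooleanClause.Occur.SHOULD);
        }
        return expanded;
      }
    }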
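For point 3, a bare-bones driver that skips the summary output and only writes the trec_top_file might look like the following. The paths, the "docname" doc-name field, the "TEXT" index field, and the run tag are all assumptions about the local setup:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.PrintWriter;

    import org.apache.lucene.benchmark.quality.QualityBenchmark;
    import org.apache.lucene.benchmark.quality.QualityQuery;
    import org.apache.lucene.benchmark.quality.QualityQueryParser;
    import org.apache.lucene.benchmark.quality.trec.TrecJudge;
    import org.apache.lucene.benchmark.quality.trec.TrecTopicsReader;
    import org.apache.lucene.benchmark.quality.utils.SimpleQQParser;
    import org.apache.lucene.benchmark.quality.utils.SubmissionReport;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    /** Sketch: run the TREC topics and write a trec_top_file for trec_eval. */
    public class RunTrecTopics {
      public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher(FSDirectory.open(new File("index")), true);

        TrecTopicsReader topicsReader = new TrecTopicsReader();
        QualityQuery[] topics =
            topicsReader.readQueries(new BufferedReader(new FileReader("topics.151-200")));
        TrecJudge judge = new TrecJudge(new BufferedReader(new FileReader("qrels.151-200")));

        QualityQueryParser qqParser = new SimpleQQParser("title", "TEXT");
        // "docname" is whatever stored field holds the TREC DOCNO in your index.
        QualityBenchmark bench = new QualityBenchmark(topics, qqParser, searcher, "docname");

        PrintWriter submission = new PrintWriter(new File("submission.txt"), "UTF-8");
        // Write the trec_top_file; the run tag is arbitrary. Pass null to skip the quality log.
        bench.execute(judge, new SubmissionReport(submission, "lucene-baseline"), null);
        submission.close();
        searcher.close();
      }
    }

trec_eval then takes the qrels and this file, e.g. "trec_eval qrels.151-200 submission.txt".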
> On Wed, Jan 27, 2010 at 1:36 PM, Ivan Provalov <iprov...@yahoo.com> wrote:
>
> > Robert, Grant:
> >
> > Thank you for your replies.
> >
> > Our goal is to fine-tune our existing system to perform better on relevance.
> >
> > I agree with Robert's comment that these collections are not completely compatible. Yes, it is possible that the results will vary somewhat depending on the differences between the collections. The reason for us picking the TREC-3 TIPSTER collection is that our production content overlaps with some TIPSTER documents.
> >
> > Any suggestions on how to obtain Lucene results comparable to the published TREC-3 numbers, or on selecting a better approach, would be appreciated.
> >
> > We are doing this project in three stages:
> >
> > 1. Test Lucene's "vanilla" performance to establish the baseline. We want to iron out issues such as topic or document formats. For example, we had to add a different parser and clean up the topic title. This will give us confidence that we are using the data and the methodology correctly.
> >
> > 2. Fine-tune Lucene based on the latest research findings (TREC papers by E. Voorhees, conference proceedings, etc.).
> >
> > 3. Repeat these steps with our production system, which runs on Lucene. The reason we are doing this step last is to ensure that our overall system doesn't introduce relevance issues of its own (content pre-processing steps, query parsing steps, etc.).
> >
> > Thank you,
> >
> > Ivan Provalov
> >
> > --- On Wed, 1/27/10, Robert Muir <rcm...@gmail.com> wrote:
> >
> > > From: Robert Muir <rcm...@gmail.com>
> > > Subject: Re: Average Precision - TREC-3
> > > To: java-user@lucene.apache.org
> > > Date: Wednesday, January 27, 2010, 11:16 AM
> > >
> > > Hello, forgive my ignorance here (I have not worked with these English TREC collections), but is the TREC-3 test collection the same as the test collection used in the 2007 paper you referenced?
> > >
> > > It looks like that is a different collection, and it isn't really possible to compare relevance scores across different collections.
> > >
> > > On Wed, Jan 27, 2010 at 11:06 AM, Grant Ingersoll <gsing...@apache.org> wrote:
> > >
> > > > On Jan 26, 2010, at 8:28 AM, Ivan Provalov wrote:
> > > >
> > > > > We are looking into making some improvements to relevance ranking of our search platform, which is based on Lucene. We started by running the ad hoc TREC task on the TREC-3 data using "out-of-the-box" Lucene. The reason for running this old TREC-3 data (TIPSTER Disks 1 and 2; topics 151-200) is that its content matches the content of our production system.
> > > > >
> > > > > We are currently getting an average precision of 0.14. We found some format issues with the TREC-3 data which were causing an even lower score. For example, the initial average precision number was 0.09. We discovered that the topics included the word "Topic:" in the <title> tag, for example "<title> Topic: Coping with overcrowded prisons". By removing this term from the queries, we bumped the average precision to 0.14.
> > > >
> > > > There's usually a lot of this involved in running TREC. I've also seen a good deal of improvement from things like using phrase queries and the DisMax query parser in Solr (which uses DisjunctionMaxQuery in Lucene, amongst other things), and from playing around with length normalization.
> > > >
> > > > > Our query is based on the title tag of the topic and the index field is based on the <TEXT> tag of the document:
> > > > >
> > > > > QualityQueryParser qqParser = new SimpleQQParser("title", "TEXT");
> > > > >
> > > > > Is there an average precision number which "out-of-the-box" Lucene should be close to? For example, this 2007 IBM TREC paper mentions 0.154: http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf
> > > >
> > > > Hard to say. I can't say I've run TREC 3. You might ask over on the Open Relevance list too (http://lucene.apache.org/openrelevance). I know Robert Muir's done a lot of experiments with Lucene on standard collections like TREC.
> > > >
> > > > I guess the bigger question back to you is: what is your goal? Is it to get better at TREC or to actually tune your system?
> > > >
> > > > -Grant
> > > >
> > > > --------------------------
> > > > Grant Ingersoll
> > > > http://www.lucidimagination.com/
> > > >
> > > > Search the Lucene ecosystem using Solr/Lucene:
> > > > http://www.lucidimagination.com/search
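Grant's phrase-query and length-normalization suggestions could be prototyped with something like the sketch below. The class names, the slop, the boost, and the norm formula are made-up starting points; note that lengthNorm is baked into the index norms, so a changed Similarity has to be set on the IndexWriter and the collection reindexed before it has any effect:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.DefaultSimilarity;
    import org.apache.lucene.search.PhraseQuery;
    import org.apache.lucene.search.Query;

    public class RelevanceTweaks {

      /** OR the original query with a sloppy phrase over the (already analyzed) title words. */
      public static Query addPhraseBoost(Query original, String field, String[] titleWords) {
        PhraseQuery phrase = new PhraseQuery();
        phrase.setSlop(3);                  // allow a few intervening words
        for (String w : titleWords) {
          phrase.add(new Term(field, w));
        }
        phrase.setBoost(2.0f);              // hypothetical boost, tune empirically
        BooleanQuery combined = new BooleanQuery();
        combined.add(original, BooleanClause.Occur.SHOULD);
        combined.add(phrase, BooleanClause.Occur.SHOULD);
        return combined;
      }

      /** Flatter length normalization than the default 1/sqrt(numTerms). */
      public static class FlatLengthNormSimilarity extends DefaultSimilarity {
        @Override
        public float lengthNorm(String fieldName, int numTerms) {
          return (float) (1.0 / Math.sqrt(Math.sqrt(numTerms))); // milder penalty for long docs
        }
      }
    }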
> > > --
> > > Robert Muir
> > > rcm...@gmail.com
>
> --
> Robert Muir
> rcm...@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org