Robert,

Thank you for this great information. Let me look into these suggestions.
Ivan

--- On Wed, 1/27/10, Robert Muir <rcm...@gmail.com> wrote:

> From: Robert Muir <rcm...@gmail.com>
> Subject: Re: Average Precision - TREC-3
> To: java-user@lucene.apache.org
> Date: Wednesday, January 27, 2010, 2:52 PM
>
> Hi Ivan, it sounds to me like you are going about it the right way. I too have complained about different document/topic formats before, at least with non-TREC test collections that claim to be in TREC format.
>
> Here is a description of what I do, for what it's worth.
>
> 1. If you use the trunk benchmark code, it will now parse Descriptions and Narratives in addition to Titles, so you can run TD and TDN queries. While I think Title-only (T) queries are generally the only interesting measure, since users typically type only a few short words into a search, the TD and TDN queries are sometimes useful for comparisons. To do this you will have to either change SimpleQQParser or write your own parser that simply creates a BooleanQuery of Title + Description + Narrative (or whatever combination you want).
>
> 2. Another thing I usually test is query expansion with MoreLikeThis, all defaults, from the top 5 returned docs. I do this with T, TD, and TDN, for 6 different MAP measures. You can see a recent example where I applied all 6 measures here: https://issues.apache.org/jira/browse/LUCENE-2234 . I feel these 6 measures give me a better overall idea of any relative relevance improvement: in that example the unexpanded T run is improved 75%, but the other 5 see only a 40-50% improvement. While unexpanded T is theoretically the most realistic measure to me, I feel it's a bit fragile and sensitive, and that is a good example of it.
>
> <I can contribute code to make it easier to do the above two things if you think it would be useful, I just haven't gotten around to it>
>
> 3. I don't even bother with the 'summary output' that the Lucene benchmark package prints out; instead I simply use the benchmark package to run the queries and generate the trec_top_file (submission.txt), which I hand to trec_eval.
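To make point 1 concrete, a custom parser along these lines might work. This is a minimal sketch assuming the Lucene 3.0-era benchmark and queryParser APIs; the TDNQueryParser class name is made up, and the topic parts are assumed to be stored by TrecTopicsReader under the keys "title", "description", and "narrative":

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.benchmark.quality.QualityQuery;
    import org.apache.lucene.benchmark.quality.QualityQueryParser;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.Version;

    /** Hypothetical TDN parser: ORs title, description and narrative into one BooleanQuery. */
    public class TDNQueryParser implements QualityQueryParser {
      private final String indexField;

      public TDNQueryParser(String indexField) {
        this.indexField = indexField;
      }

      public Query parse(QualityQuery qq) throws ParseException {
        QueryParser qp = new QueryParser(Version.LUCENE_30, indexField,
            new StandardAnalyzer(Version.LUCENE_30));   // adjust Version to your release
        BooleanQuery bq = new BooleanQuery();
        // Topic parts as (assumed) stored by TrecTopicsReader; drop "narrative" for TD runs.
        for (String part : new String[] { "title", "description", "narrative" }) {
          String text = qq.getValue(part);
          if (text != null && text.length() > 0) {
            bq.add(qp.parse(QueryParser.escape(text)), BooleanClause.Occur.SHOULD);
          }
        }
        return bq;
      }
    }

Dropping entries from the array gives the T and TD variants without touching the rest of the benchmark setup.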
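Point 2's blind feedback could look roughly like the sketch below. MoreLikeThis ships in contrib/queries; the Top5Expander name and the single-field setup are assumptions, and the expanded query is simply the original ORed with a MoreLikeThis query per top document:

    import java.io.IOException;

    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.similar.MoreLikeThis;

    /** Sketch of blind feedback: expand a query from its top 5 hits with MoreLikeThis defaults. */
    public class Top5Expander {

      public static Query expand(IndexSearcher searcher, Query initial, String field)
          throws IOException {
        ScoreDoc[] top = searcher.search(initial, 5).scoreDocs;

        MoreLikeThis mlt = new MoreLikeThis(searcher.getIndexReader()); // all defaults, as above
        mlt.setFieldNames(new String[] { field });

        BooleanQuery expanded = new BooleanQuery();
        expanded.add(initial, BooleanClause.Occur.SHOULD);
        for (ScoreDoc sd : top) {
          // like(docId) builds a query from the "interesting" terms of that document
          expanded.add(mlt.like(sd.doc), BooleanClause.Occur.SHOULD);
        }
        return expanded;
      }
    }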
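For point 3, a bare-bones driver that skips the summary output and only writes the trec_top_file might look like the following. The paths, the "docname" doc-name field, the "TEXT" index field, and the run tag are all assumptions about the local setup:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.PrintWriter;

    import org.apache.lucene.benchmark.quality.QualityBenchmark;
    import org.apache.lucene.benchmark.quality.QualityQuery;
    import org.apache.lucene.benchmark.quality.QualityQueryParser;
    import org.apache.lucene.benchmark.quality.trec.TrecJudge;
    import org.apache.lucene.benchmark.quality.trec.TrecTopicsReader;
    import org.apache.lucene.benchmark.quality.utils.SimpleQQParser;
    import org.apache.lucene.benchmark.quality.utils.SubmissionReport;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    /** Sketch: run the TREC topics and write a trec_top_file for trec_eval. */
    public class RunTrecTopics {
      public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher(FSDirectory.open(new File("index")), true);

        TrecTopicsReader topicsReader = new TrecTopicsReader();
        QualityQuery[] topics =
            topicsReader.readQueries(new BufferedReader(new FileReader("topics.151-200")));
        TrecJudge judge = new TrecJudge(new BufferedReader(new FileReader("qrels.151-200")));

        QualityQueryParser qqParser = new SimpleQQParser("title", "TEXT");
        // "docname" is whatever stored field holds the TREC DOCNO in your index.
        QualityBenchmark bench = new QualityBenchmark(topics, qqParser, searcher, "docname");

        PrintWriter submission = new PrintWriter(new File("submission.txt"), "UTF-8");
        // Write the trec_top_file; the run tag is arbitrary. Pass null to skip the quality log.
        bench.execute(judge, new SubmissionReport(submission, "lucene-baseline"), null);
        submission.close();
        searcher.close();
      }
    }

trec_eval then takes the qrels and this file, e.g. "trec_eval qrels.151-200 submission.txt".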
> On Wed, Jan 27, 2010 at 1:36 PM, Ivan Provalov <iprov...@yahoo.com> wrote:
>
> > Robert, Grant:
> >
> > Thank you for your replies.
> >
> > Our goal is to fine-tune our existing system to perform better on relevance.
> >
> > I agree with Robert's comment that these collections are not completely compatible. Yes, it is possible that the results will vary somewhat depending on the differences between the collections. The reason for us picking the TREC-3 TIPSTER collection is that our production content overlaps with some TIPSTER documents.
> >
> > Any suggestions on how to obtain Lucene results comparable to the published TREC-3 numbers, or on selecting a better approach, would be appreciated.
> >
> > We are doing this project in three stages:
> >
> > 1. Test Lucene's "vanilla" performance to establish the baseline. We want to iron out issues such as topic or document formats. For example, we had to add a different parser and clean up the topic title. This will give us confidence that we are using the data and the methodology correctly.
> >
> > 2. Fine-tune Lucene based on the latest research findings (TREC papers by E. Voorhees, conference proceedings, etc.).
> >
> > 3. Repeat these steps with our production system, which runs on Lucene. The reason we are doing this step last is to ensure that our overall system doesn't introduce relevance issues of its own (content pre-processing steps, query parsing steps, etc.).
> >
> > Thank you,
> >
> > Ivan Provalov
> >
> > --- On Wed, 1/27/10, Robert Muir <rcm...@gmail.com> wrote:
> >
> > > From: Robert Muir <rcm...@gmail.com>
> > > Subject: Re: Average Precision - TREC-3
> > > To: java-user@lucene.apache.org
> > > Date: Wednesday, January 27, 2010, 11:16 AM
> > >
> > > Hello, forgive my ignorance here (I have not worked with these English TREC collections), but is the TREC-3 test collection the same as the test collection used in the 2007 paper you referenced?
> > >
> > > It looks like that is a different collection, and it isn't really possible to compare relevance scores across different collections.
> > >
> > > On Wed, Jan 27, 2010 at 11:06 AM, Grant Ingersoll <gsing...@apache.org> wrote:
> > >
> > > > On Jan 26, 2010, at 8:28 AM, Ivan Provalov wrote:
> > > >
> > > > > We are looking into making some improvements to relevance ranking of our search platform, which is based on Lucene. We started by running the ad hoc TREC task on the TREC-3 data using "out-of-the-box" Lucene. The reason for running this old TREC-3 data (TIPSTER Disks 1 and 2; topics 151-200) is that its content matches the content of our production system.
> > > > >
> > > > > We are currently getting an average precision of 0.14. We found some format issues with the TREC-3 data which were causing an even lower score. For example, the initial average precision number was 0.09. We discovered that the topics included the word "Topic:" in the <title> tag, for example "<title> Topic: Coping with overcrowded prisons". By removing this term from the queries, we bumped the average precision to 0.14.
> > > >
> > > > There's usually a lot of this involved in running TREC. I've also seen a good deal of improvement from things like using phrase queries and the DisMax query parser in Solr (which uses DisjunctionMaxQuery in Lucene, amongst other things), and from playing around with length normalization.
> > > >
> > > > > Our query is based on the title tag of the topic and the index field is based on the <TEXT> tag of the document:
> > > > >
> > > > > QualityQueryParser qqParser = new SimpleQQParser("title", "TEXT");
> > > > >
> > > > > Is there an average precision number which "out-of-the-box" Lucene should be close to? For example, this 2007 IBM TREC paper mentions 0.154: http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf
> > > >
> > > > Hard to say. I can't say I've run TREC 3. You might ask over on the Open Relevance list too (http://lucene.apache.org/openrelevance). I know Robert Muir's done a lot of experiments with Lucene on standard collections like TREC.
> > > >
> > > > I guess the bigger question back to you is: what is your goal? Is it to get better at TREC or to actually tune your system?
> > > >
> > > > -Grant
> > > >
> > > > --------------------------
> > > > Grant Ingersoll
> > > > http://www.lucidimagination.com/
> > > >
> > > > Search the Lucene ecosystem using Solr/Lucene:
> > > > http://www.lucidimagination.com/search
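Grant's phrase-query and length-normalization suggestions could be prototyped with something like the sketch below. The class names, the slop, the boost, and the norm formula are made-up starting points; note that lengthNorm is baked into the index norms, so a changed Similarity has to be set on the IndexWriter and the collection reindexed before it has any effect:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.DefaultSimilarity;
    import org.apache.lucene.search.PhraseQuery;
    import org.apache.lucene.search.Query;

    public class RelevanceTweaks {

      /** OR the original query with a sloppy phrase over the (already analyzed) title words. */
      public static Query addPhraseBoost(Query original, String field, String[] titleWords) {
        PhraseQuery phrase = new PhraseQuery();
        phrase.setSlop(3);                  // allow a few intervening words
        for (String w : titleWords) {
          phrase.add(new Term(field, w));
        }
        phrase.setBoost(2.0f);              // hypothetical boost, tune empirically
        BooleanQuery combined = new BooleanQuery();
        combined.add(original, BooleanClause.Occur.SHOULD);
        combined.add(phrase, BooleanClause.Occur.SHOULD);
        return combined;
      }

      /** Flatter length normalization than the default 1/sqrt(numTerms). */
      public static class FlatLengthNormSimilarity extends DefaultSimilarity {
        @Override
        public float lengthNorm(String fieldName, int numTerms) {
          return (float) (1.0 / Math.sqrt(Math.sqrt(numTerms))); // milder penalty for long docs
        }
      }
    }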
> > > --
> > > Robert Muir
> > > rcm...@gmail.com
>
> --
> Robert Muir
> rcm...@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org