RE: Relevance percentage

Chuck Williams Mon, 20 Dec 2004 09:32:48 -0800

The coord() value is not saved anywhere so you would need to recompute
it.  You could either call explain() and parse the result string, or
better, look at explain() and implement what it does more efficiently
just for coord().  If your queries are all BooleanQuery's of
TermQuery's, then this is very simple.  Iterate down the list of
BooleanClause's and count the number whose score is > 0, then divide
this by the total number of clauses.  Take a look at
BooleanQuery.BooleanWeight.explain() as it does this (along with
generating the rest of the explanation).  If you support the full Lucene
query language, then you need to look at all the query types and decide
what exactly you want to compute (as coord is not always well-defined).


I'm on the West Coast of the U.S. so evidently on a very different time
zone from you -- will look at your other message next.

Chuck

  > -----Original Message-----
  > From: Gururaja H [mailto:[EMAIL PROTECTED]
  > Sent: Monday, December 20, 2004 6:10 AM
  > To: Lucene Users List; Mike Snare
  > Subject: Re: Relevance percentage
  > 
  > Hi,
  > 
  > But, How to calculate the coord() fraction ?  I know by default,
  > in DefaultSimilarity the coord() fraction is defined as below:
  > 
  > /** Implemented as <code>overlap / maxOverlap</code>. */
  > 
  > public float coord(int overlap, int maxOverlap) {
  > 
  > return overlap / (float)maxOverlap;
  > 
  > }
  > How to get the overlap and maxOverlap value in each of the matched
  > document(s) ?
  > 
  > Thanks,
  > Gururaja
  > 
  > Mike Snare <[EMAIL PROTECTED]> wrote:
  > I'm still new to Lucene, but wouldn't that be the coord()? My
  > understanding is that the coord() is the fraction of the boolean
query
  > that matched a given document.
  > 
  > Again, I'm new, so somebody else will have to confirm or deny...
  > 
  > -Mike
  > 
  > 
  > On Mon, 20 Dec 2004 00:33:21 -0800 (PST), Gururaja H
  > wrote:
  > > How to find out the percentages of matched terms in the
document(s)
  > using Lucene ?
  > > Here is an example, of what i am trying to do:
  > > The search query has 5 terms(ibm, risc, tape, dirve, manual) and
there
  > are 4 matching
  > > documents with the following attributes:
  > > Doc#1: contains terms(ibm,drive)
  > > Doc#2: contains terms(ibm,risc, tape, drive)
  > > Doc#3: contains terms(ibm,risc, tape,drive)
  > > Doc#4: contains terms(ibm, risc, tape, drive, manual).
  > > The percentages displayed would be 100%(Doc#4), 80%(doc#2),
80%(doc#3)
  > and 40%
  > > (doc#1).
  > >
  > > Any help on how to go about doing this ?
  > >
  > > Thanks,
  > > Gururaja
  > >
  > >
  > > ---------------------------------
  > > Do you Yahoo!?
  > > Send a seasonal email greeting and help others. Do good.
  > >
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: [EMAIL PROTECTED]
  > For additional commands, e-mail: [EMAIL PROTECTED]
  > 
  > 
  > 
  > ---------------------------------
  > Do you Yahoo!?
  >  All your favorites on one personal page - Try My Yahoo!

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Relevance percentage

Reply via email to