Re: SweetSpotSimiliarity

Doug Cutting Wed, 24 May 2006 09:33:58 -0700

Marvin Humphrey wrote:

The only answer seems to be to apply different lengthNorm algos to different fields.


FYI, Nutch uses the following:

http://svn.apache.org/viewvc/lucene/nutch/trunk/src/java/org/apache/nutch/indexer/NutchSimilarity.java?view=markup
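
To illustrate the per-field idea in general terms, here is a minimal sketch of a length norm that dispatches on the field name. It is not the NutchSimilarity code linked above. The field names, the formula chosen for each field, and the MIN_CONTENT_LENGTH floor are assumptions made for illustration; only the 1/sqrt(numTokens) fallback mirrors Lucene's default similarity.

```python
import math

# Hypothetical floor: treat very short bodies as if they had this many tokens,
# so tiny pages are not boosted merely for being short.
MIN_CONTENT_LENGTH = 1000


def length_norm(field_name: str, num_tokens: int) -> float:
    """Return a length normalization factor chosen per field (illustrative)."""
    num_tokens = max(num_tokens, 1)  # guard against division by zero
    if field_name == "url":
        # Assumed choice for a short field: penalize length linearly.
        return 1.0 / num_tokens
    if field_name == "anchor":
        # Assumed choice for a field that accumulates many tokens:
        # dampen the penalty with a logarithm.
        return 1.0 / math.log(math.e + num_tokens)
    if field_name == "content":
        # Assumed choice for body text: square-root norm with a length floor.
        return 1.0 / math.sqrt(max(num_tokens, MIN_CONTENT_LENGTH))
    # Fallback: 1/sqrt(numTokens), as in Lucene's default similarity.
    return 1.0 / math.sqrt(num_tokens)
```

With these assumed formulas, a 10-token body and a 1000-token body get the same norm, while a 4-token URL is penalized more heavily than a 4-token field that falls through to the default.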

All of this is seat-of-the-pants, developed by hand-tuning a few queries. Like code optimization, relevance tuning is better done with large amounts of real data. If you have trusted relevant/non-relevant judgements for a representative sample of queries, then you can do a much better job of setting these parameters. Unfortunately, such judgements are expensive to generate.
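
As a rough sketch of what tuning against judgements could look like, the code below scores each candidate parameter setting by mean precision at k over a set of judged queries and keeps the best one. Everything here is hypothetical: the function names, the make_search factory (which is assumed to return a function mapping a query to a ranked list of document ids for a given setting), and the choice of precision at k as the metric are illustrations, not part of Lucene or Nutch.

```python
from typing import Callable, Dict, Iterable, List, Set, TypeVar

Setting = TypeVar("Setting")
SearchFn = Callable[[str], List[str]]  # query -> ranked document ids


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top k results judged relevant (missing results count as misses)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def mean_precision(search: SearchFn, judgements: Dict[str, Set[str]], k: int = 10) -> float:
    """Average precision at k over all judged queries."""
    if not judgements:
        return 0.0
    scores = [precision_at_k(search(q), rel, k) for q, rel in judgements.items()]
    return sum(scores) / len(scores)


def best_setting(
    candidates: Iterable[Setting],
    make_search: Callable[[Setting], SearchFn],
    judgements: Dict[str, Set[str]],
    k: int = 10,
) -> Setting:
    """Return the candidate setting whose search scores highest on the judgements."""
    return max(candidates, key=lambda c: mean_precision(make_search(c), judgements, k))
```

The more judged queries there are, the less likely the chosen setting is to be an artifact of a handful of hand-picked examples.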


For Web data, one source of relevance judgements is:

http://ir.dcs.gla.ac.uk/test_collections/

Doug

