Those are interesting papers, especially the one by Robertson.  There are subtle 
variations in the specific form of idf, but in all of the models presented the term is 
linear, not quadratic.  Robertson's theoretical arguments justify a linear term.
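
To make the linear-versus-quadratic distinction concrete, here is a minimal sketch in Python. The idf form, the function names, and the toy counts are all assumptions chosen for illustration; this is not Lucene's implementation, nor a formula taken from either paper. It only shows that raising idf to the second power can change which document ranks first.

```python
import math

def idf(num_docs, doc_freq):
    # One common smoothed form (an assumption); the exact form varies by model.
    return math.log(num_docs / (doc_freq + 1)) + 1.0

def score(term_freqs, doc_freqs, num_docs, idf_power):
    # Sum of per-term contributions tf * idf**idf_power.
    # idf_power=1 gives a linear idf term, idf_power=2 a quadratic one.
    total = 0.0
    for term, tf in term_freqs.items():
        total += tf * idf(num_docs, doc_freqs[term]) ** idf_power
    return total

# Hypothetical toy collection statistics.
num_docs = 1000
doc_freqs = {"common": 499, "rare": 9}

doc_a = {"common": 4}  # four occurrences of a frequent term
doc_b = {"rare": 1}    # one occurrence of a rare term

for power in (1, 2):
    a = score(doc_a, doc_freqs, num_docs, power)
    b = score(doc_b, doc_freqs, num_docs, power)
    print(f"idf^{power}: doc_a={a:.3f} doc_b={b:.3f}")
```

With these toy numbers the linear score ranks doc_a first, while the squared score ranks doc_b first, so the choice of exponent is not just a rescaling.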

  > -----Original Message-----
  > From: Antonio Gulli [mailto:[EMAIL PROTECTED]
  > Sent: Friday, October 22, 2004 9:59 AM
  > To: Lucene Developers List
  > Subject: Re: Contribution: better multi-field searching
  > 
  > 
  > >
  > > If someone can demonstrate that an alternate formulation produces
  > > superior results for most applications, then we should of course
  > > change the default implementation.  But just noting that there's a
  > > factor which is equal to idf^2 in each element of the sum does not do
  > > this.
  > 
  > I don't think there is a magic formula, but I found these papers
  > interesting.
  > http://www.emeraldinsight.com/rpsv/cgi-bin/emft.pl
  > 
  > Title: Understanding inverse document frequency: on theoretical
  > arguments for IDF
  > Author: Stephen Robertson
  > Pages: 503-520
  > 
  > Title: IDF term weighting and IR research lessons
  > Author: Karen Spärck Jones
  > Pages: 521-523
  > 
  > 
  > 
  > ---------------------------------------------------------------------
  > To unsubscribe, e-mail: [EMAIL PROTECTED]
  > For additional commands, e-mail: [EMAIL PROTECTED]

