But in the Javadoc expression there is no TF-IDF weight for the query, only for the document, whereas the cosine uses both... hmm, strange.
I have a report to write about Lucene, and I don't know which formula to put in the paper or how to explain it.

----- Original Message -----
From: "Karl Koch" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Sunday, January 18, 2004 11:54 PM
Subject: Re: difference in javadoc and faq similarity expression

> I would rely on the JavaDoc since this one is up to date. The latest version,
> 1.3 final, is just a few weeks old. Some entries in the FAQ, however, are still
> from 2001...
>
> Cheers,
> Karl
>
> > Hi,
> > I am having trouble finding the correspondence between the Javadoc and FAQ
> > similarity expressions.
> >
> > In the Similarity Javadoc:
> >
> > score(q,d) = Sum [ tf(t in d) * idf(t) * getBoost(t.field in d) *
> >                    lengthNorm(t.field in d) * coord(q,d) * queryNorm(q) ]
> >
> > In the FAQ:
> >
> > score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
> >           * coord_q_d
> >
> > In FAQ       | In Javadoc
> > 1 / norm_q   = queryNorm(q)
> > 1 / norm_d_t = lengthNorm(t.field in d)
> > coord_q_d    = coord(q,d)
> > boost_t      = getBoost(t.field in d)
> > idf_t        = idf(t)
> > tf_d         = tf(t in d)
> >
> > But where is the Javadoc expression for the FAQ's "tf_q"?
> >
> > nicolas
> >
> > ----- Original Message -----
> > From: "Nicolas Maisonneuve" <[EMAIL PROTECTED]>
> > To: "Lucene Users List" <[EMAIL PROTECTED]>
> > Sent: Sunday, January 18, 2004 9:33 PM
> > Subject: Re: theorical informations
> >
> > > Thanks, Karl!
> > >
> > > ----- Original Message -----
> > > From: "Karl Koch" <[EMAIL PROTECTED]>
> > > To: "Lucene Users List" <[EMAIL PROTECTED]>
> > > Sent: Sunday, January 18, 2004 9:22 PM
> > > Subject: Re: theorical informations
> > >
> > > > Actually, finding an answer to this question is not really important. More
> > > > important is whether you can do what you want with it. If your result comes
> > > > from a probabilistic model or a vector space model, who cares, if you just
> > > > want to give a query and get back a hit list of results?
> > > >
> > > > Possibly some people here will strongly disagree... ;-) (?)
> > > >
> > > > Karl
> > > >
> > > > > Hello Nicolas,
> > > > >
> > > > > I am sure you mean the IR (Information Retrieval) model. Lucene implements
> > > > > a Vector Space Model with an integrated Boolean Model. This means the
> > > > > Boolean model is integrated through a Boolean query language but mapped
> > > > > into the vector space. Therefore you have ranking, even though the
> > > > > traditional Boolean model does not support it. Cosine similarity is used
> > > > > to measure the similarity between documents and the query. You can find
> > > > > this in a very long discussion here when you search the archive...
> > > > >
> > > > > Karl
> > > > >
> > > > > > Hi,
> > > > > > I have 2 theoretical questions:
> > > > > >
> > > > > > I searched the mailing list for the IR model implemented in Lucene,
> > > > > > but found no precise answer.
> > > > > >
> > > > > > 1) What is the IR model implemented in Lucene? (e.g. Boolean Model,
> > > > > > Vector Model, Probabilistic Model, etc.)
> > > > > >
> > > > > > 2) What is the theoretical similarity function implemented in Lucene?
> > > > > > (Euclidean, Cosine, Jaccard, Dice)
> > > > > >
> > > > > > (Why is this important information not on the Lucene web site or in
> > > > > > the FAQ?)

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
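
To make the FAQ/Javadoc comparison in this thread concrete, here is a minimal Python sketch that evaluates both expressions exactly as they are quoted above, using the substitutions from the "In FAQ | In Javadoc" table. It is only arithmetic on the two quoted formulas, not Lucene code: the function names, the dictionary keys, and all the numbers are made up for illustration, and tf_q = 1 is an assumption (each query term typed once), not something the thread confirms about Lucene.

```python
import math

# Hypothetical toy statistics for two query terms in one document.
# None of these values come from Lucene; they only exercise the formulas.
TERMS = [
    {"term": "lucene", "tf_q": 1, "tf_d": 3, "idf": 2.0, "norm_d": 4.0, "boost": 1.0},
    {"term": "model",  "tf_q": 1, "tf_d": 1, "idf": 1.5, "norm_d": 4.0, "boost": 1.0},
]
COORD_Q_D = 1.0   # coord_q_d / coord(q,d)
NORM_Q = 2.5      # norm_q, so queryNorm(q) = 1 / norm_q


def faq_terms(terms, coord_q_d, norm_q):
    """Per-term contributions of the FAQ form:
    tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t, times coord_q_d."""
    out = []
    for t in terms:
        query_weight = t["tf_q"] * t["idf"] / norm_q      # query-side weight
        doc_weight = t["tf_d"] * t["idf"] / t["norm_d"]   # document-side weight
        out.append(query_weight * doc_weight * t["boost"] * coord_q_d)
    return out


def javadoc_terms(terms, coord_q_d, norm_q):
    """Per-term contributions of the Javadoc form as quoted in the thread:
    tf(t in d) * idf(t) * getBoost * lengthNorm * coord(q,d) * queryNorm(q)."""
    query_norm = 1.0 / norm_q                # table: 1 / norm_q = queryNorm(q)
    out = []
    for t in terms:
        length_norm = 1.0 / t["norm_d"]      # table: 1 / norm_d_t = lengthNorm
        out.append(t["tf_d"] * t["idf"] * t["boost"] * length_norm
                   * coord_q_d * query_norm)
    return out


faq = faq_terms(TERMS, COORD_Q_D, NORM_Q)
jd = javadoc_terms(TERMS, COORD_Q_D, NORM_Q)

for t, f, j in zip(TERMS, faq, jd):
    # The ratio is exactly the query-side factor tf_q * idf_t that the
    # quoted Javadoc expression does not show.
    print(t["term"], "faq:", f, "javadoc:", j, "ratio:", f / j)

print("faq score:", sum(faq), "javadoc score:", sum(jd))
```

Term by term, the FAQ contribution differs from the quoted Javadoc contribution by the factor tf_q * idf_t, which is the query-side weight being asked about. Under the tf_q = 1 assumption, what remains is one extra idf_t per term, so for a report it may be worth stating explicitly which of the two forms is being described.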
