Yes, I believe so.
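
For what it's worth, here is a minimal sketch of that length effect. It assumes the norm_d_t and tf_d definitions from the FAQ formula quoted further down; the function names are made up for illustration and are not Lucene API.

```python
import math


def tf(freq: int) -> float:
    """tf_d from the FAQ: square root of the term's frequency in the document."""
    return math.sqrt(freq)


def length_norm(num_tokens: int) -> float:
    """1 / norm_d_t, where norm_d_t = sqrt(number of tokens in the field)."""
    return 1.0 / math.sqrt(num_tokens)


# The same keyword, appearing once, in a 100-token field and a 10,000-token field.
short_doc = tf(1) * length_norm(100)
long_doc = tf(1) * length_norm(10_000)

print("short field:", short_doc)
print("long field: ", long_doc)
```

With everything else equal, the single occurrence in the shorter field contributes more, because it is divided by a smaller norm_d_t.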

--- Terry Steichen <[EMAIL PROTECTED]> wrote:
> Otis,
> 
> Didn't somebody (Doug?) also mention that a keyword in a shorter document
> is deemed more significant than in a longer one (because, I guess, it
> represents a larger percentage of the document)?
> 
> Regards,
> 
> Terry
> ----- Original Message -----
> From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>;
> <[EMAIL PROTECTED]>
> Sent: Thursday, January 23, 2003 10:58 AM
> Subject: Re: Interpreting the score asociated with the Term?
> 
> 
> > Here is a simplified explanation of some basic stuff.
> >
> > 1. The more frequent the term (in a collection), the lower its weight
> > (significance). Makes sense: very popular words don't distinguish one
> > document from another much, because they are present in so many docs.
> >
> > 2. The more frequent a word in a single document, the higher the
> > document's 'value' when the query contains that word. So the score goes
> > up for words that are frequent in a document, esp. if they are not
> > frequent in other documents in the collection.
> >
> > 3. There is a boost factor which allows you to boost certain terms at
> > query time (e.g. you value matches in the title field more than in the
> > body field? Boost title field queries).
> >
> > 4. The normalization factor, I believe, normalizes things so that longer
> > documents don't have an advantage over shorter ones.
> >
> > There is more to this... but I am already not 100% sure about all of the
> > above, so I'll stop here :)
> >
> > Also note that you can boost fields at index time (you'll have to use
> > the nightly build for that instead of the 1.2 release to get this, I
> > believe).
> >
> > Otis
> >
> >
> > --- Rishabh Bajpai <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi All,
> > >
> > > I am using Lucene as a Search Engine for my work. I am new to this,
> > > so forgive me if I am asking a cliched question!
> > >
> > > I need to understand how the SCORE for the search TERMs is calculated
> > > in Lucene, so that indexing can be appropriately designed to
> > > return the most relevant results when searched.
> > >
> > > On the official FAQ page of the Lucene site, a formula is listed as
> > > score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t *
> > > boost_t) * coord_q_d
> > > where:
> > >   score_d   : score for document d
> > >   sum_t     : sum for all terms t
> > >   tf_q      : the square root of the frequency of t in the query
> > >   tf_d      : the square root of the frequency of t in d
> > >   idf_t     : log(numDocs/(docFreq_t+1)) + 1.0
> > >   numDocs   : number of documents in index
> > >   docFreq_t : number of documents containing t
> > >   norm_q    : sqrt(sum_t((tf_q*idf_t)^2))
> > >   norm_d_t  : square root of number of tokens in d in the same field
> > >               as t
> > >   boost_t   : the user-specified boost for term t
> > >   coord_q_d : number of terms in both query and document / number of
> > >               terms in query
> > >
> > > I did not find the formula too helpful in figuring out what exactly
> > > the score is trying to calculate.
> > >
> > > I want to know of a logic that can be used for translating this score
> > > into something that can be used for determining which Terms are more
> > > relevant for a given Search Request.
> > >
> > > One way would be to just assume that the higher the score, the more
> > > relevant the search result. But is this assumption really valid? Or are
> > > there any possible caveats to this?
> > >
> > > -Rishabh
> > >
> > >
> > >
> > > _____________________________________________________________
> > > Get 25MB, POP3, Spam Filtering with LYCOS MAIL PLUS for $19.95/year.
> > > http://login.mail.lycos.com/brandPage.shtml?pageId=plus&ref=lmtplus
> > >
> > > --
> > > To unsubscribe, e-mail:
> > > <mailto:[EMAIL PROTECTED]>
> > > For additional commands, e-mail:
> > > <mailto:[EMAIL PROTECTED]>
> > >
> >
> >
> > __________________________________________________
> > Do you Yahoo!?
> > Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
> > http://mailplus.yahoo.com
> >
> 
> 

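To make the quoted FAQ formula a bit more concrete, here is a rough sketch of it in Python. It is not Lucene's actual code. It treats each document as a single field, reads idf_t as log(numDocs/(docFreq_t+1)) + 1.0, and the names idf, score, docs and doc_freqs are all made up for this illustration.

```python
import math
from collections import Counter


def idf(num_docs, doc_freq):
    # Assumed reading of the FAQ's idf_t: log(numDocs / (docFreq_t + 1)) + 1.0
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def score(query_terms, doc_tokens, doc_freqs, num_docs, boosts=None):
    """score_d for one single-field document, following the FAQ formula."""
    boosts = boosts or {}
    q_counts = Counter(query_terms)
    d_counts = Counter(doc_tokens)
    if not q_counts or not doc_tokens:
        return 0.0

    weights = {t: idf(num_docs, doc_freqs.get(t, 0)) for t in q_counts}
    # norm_q = sqrt(sum_t((tf_q * idf_t)^2))
    norm_q = math.sqrt(
        sum((math.sqrt(f) * weights[t]) ** 2 for t, f in q_counts.items())
    )
    # norm_d_t = sqrt(number of tokens in the field)
    norm_d = math.sqrt(len(doc_tokens))

    total = 0.0
    matched = 0
    for t, q_freq in q_counts.items():
        d_freq = d_counts[t]
        if d_freq == 0:
            continue  # term not in this document, contributes nothing
        matched += 1
        query_part = math.sqrt(q_freq) * weights[t] / norm_q
        doc_part = math.sqrt(d_freq) * weights[t] / norm_d
        total += query_part * doc_part * boosts.get(t, 1.0)

    # coord_q_d = terms in both query and document / terms in query
    return total * matched / len(q_counts)


# Toy collection (invented data, just to exercise the function).
docs = {
    "d1": "lucene scoring".split(),
    "d2": "lucene is a search library written in java".split(),
    "d3": "cooking recipes".split(),
}
doc_freqs = Counter(t for tokens in docs.values() for t in set(tokens))
num_docs = len(docs)

scores = {
    name: score(["lucene"], tokens, doc_freqs, num_docs)
    for name, tokens in docs.items()
}
for name, value in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(value, 4))
```

In this toy run the short document d1 ranks above the longer d2 for the same single match, and d3 gets 0 because it does not contain the term. On this reading the score is mainly useful for ordering documents within one query, since norm_q depends on the query itself.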
