: > 1) the "boosts" associated with Fields and Documents at indexing time,
: > which are combined with the lengthNorm at index time to determine a single
: > "norm" value for the doc/field pair.
:
: I don't think this is what I want because the lengthNorm is still using
: the # of terms.
You can override the lengthNorm function to return "1.0" for all fields
regardless of length, and then the norm will consist solely of the boost.

: Yes, I noticed this, but this is not what I want because it's using "idf
: of the terms being queried".  What I want fieldWeight to be is
: 1/sqrt(sumOfSquaredWeights), where sumOfSquaredWeights is the sum of tf^2
: over all terms in the field.

Ah, but when you are building your index, how can Lucene know what the tf
for all of the terms in the field is? ... you still have more documents to
add that can affect the tf.  If you know what the frequencies are when you
add the document, then you can square them and use that as the field boost,
and you should have what you want.

: 3) I've got another issue with the explanation, which seems to be a bug.
: Below, I've given a printout of the explanation.  There's something
: strange: when I use my own Similarity it prints out all query terms
: despite some of them not appearing in the doc (see "formulation", whose
: docFreq = 0 but which still appears in the explanation).

It looks like your tf function is returning non-zero values when the input
is 0, which is going to give you really weird behavior -- including saying
that certain docs match a clause even when they don't.  Even if you want tf
to return a constant value, you have to keep the "non-matching" case in
mind and return 0 when the input is 0.  (See the sketch at the end of this
message.)

-Hoss

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
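A minimal sketch of the two overrides discussed above, assuming the classic
DefaultSimilarity API (lengthNorm/tf overrides) plus Field.setBoost() at
indexing time; the class name, field name, and helper method are
illustrative, not taken from the original thread:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.search.DefaultSimilarity;

    public class FlatSimilarity extends DefaultSimilarity {

        // Ignore field length entirely, so the index-time norm reduces to
        // whatever boost was set on the document/field.
        public float lengthNorm(String fieldName, int numTerms) {
            return 1.0f;
        }

        // Constant tf for matching docs, but 0 for the non-matching case;
        // otherwise docs that don't contain the term appear to match.
        public float tf(float freq) {
            return (freq > 0) ? 1.0f : 0.0f;
        }

        // Hypothetical indexing helper: precomputedWeight is whatever you
        // compute from the field's own term frequencies when adding the
        // document (e.g. 1/sqrt(sum of tf^2)); with lengthNorm() == 1.0 it
        // becomes the stored norm for the field.
        public static Document makeDoc(String text, float precomputedWeight) {
            Document doc = new Document();
            Field body = new Field("body", text, Field.Store.NO,
                                   Field.Index.TOKENIZED);
            body.setBoost(precomputedWeight);
            doc.add(body);
            return doc;
        }
    }

For it to take effect, the same Similarity would need to be set on both the
IndexWriter and the Searcher via setSimilarity().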