The fieldNorm is lengthNorm * document boost.  The final value is
"rounded" when it is stored in the index, which is why you're getting
such clean numbers for your fieldNorm.  If you're finding that these
pages get too high a boost, you can lower indexer.score.power in your
conf file.
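
To make the arithmetic concrete, here is a rough sketch in Python (not
the actual Nutch/Lucene code). It makes two assumptions that are not
spelled out above: that the document boost is the page score raised to
indexer.score.power, and that the stored norm keeps only about three
significant bits, which is my understanding of the one-byte norm
encoding. The function names and the sample score of 120 are made up for
illustration; lengthNorm uses the 1/sqrt(numTokens) form from the
question.

```python
import math


def document_boost(page_score, score_power):
    # Assumption: boost = score ** indexer.score.power
    return page_score ** score_power


def length_norm(num_tokens):
    # One of the two lengthNorm forms mentioned in the question
    return 1.0 / math.sqrt(num_tokens)


def quantize_norm(value, kept_bits=3):
    # Assumption: the stored norm keeps only `kept_bits` significant bits
    # (leading bit included), so everything below that is truncated.
    if value <= 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    steps = 2 ** kept_bits
    return math.ldexp(math.floor(mantissa * steps) / steps, exponent)


def field_norm(num_tokens, page_score, score_power):
    raw = length_norm(num_tokens) * document_boost(page_score, score_power)
    return quantize_norm(raw)


# A 4-token anchor on a page with a hypothetical score of 120:
print(field_norm(4, 120.0, 0.5))   # raw value is about 5.48, stored as 5.0
print(field_norm(4, 120.0, 0.25))  # lower power: raw about 1.65, stored as 1.5
```

Under these assumptions, any raw value between 4 and 8 can only come out
as 4.0, 5.0, 6.0 or 7.0, which would explain the whole numbers, and
lowering the power shrinks the boost for high-scoring pages.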

As for your problem in #2, look at the explain page to see how your
search result got there.  Maybe there's a high score for an anchor
match.  Anchor text comes from links on other pages that point to the
page, so it doesn't show up in the text of the page itself.  Maybe
that's it.

Andy

On 8/3/05, Howie Wang <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I've been noticing some strange search results recently. I seem
> to be getting two issues.
> 
> 1. The fieldNorm for certain terms is unusually high for certain sites
> for anchors and titles. And they are usually just whole numbers (4.0, 5.0,
> etc).
> I find this strange since the lengthNorm used to calculate this is
> very unlikely to result in an integer. It's either 1/sqrt(numTokens) or
> 1/log(e+numTokens). Where is 5.0 coming from?
> 
> 2. I'm getting hits for sites that don't contain ANY of the terms in my
> search. This is exacerbated by issue #1 since the fieldNorm boosts this
> page to the top of the results. I thought it might be because of my
> changes for stemming, but this happens for search terms that are not
> changed by stemming at all.
> 
> Anyone run into something like this? Any ideas on how to start debugging?
> 
> Thanks,
> Howie


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
