The fieldNorm is lengthNorm * document boost. The final value is "rounded" when it is stored, which is why you're getting such clean numbers for your fieldNorm. If you find that these pages are being boosted too much, you can lower indexer.score.power in your conf file.
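To make the "rounding" concrete, here is a rough sketch in Python (not Nutch code). It assumes the norm is stored the way Lucene's one-byte norm encoding works, with a 3-bit mantissa and a 5-bit exponent, and that the document boost is score^indexer.score.power. The function names (encode_norm, decode_norm, stored_field_norm, doc_boost) and the sample numbers are made up for illustration, so check them against the version you are running.

```python
import math
import struct

# Assumed layout: 3 mantissa bits, 5 exponent bits, zero exponent at 15
# (modelled on Lucene's one-byte norm encoding; not taken from Nutch itself).
_OFFSET = (63 - 15) << 3


def encode_norm(f):
    """Pack a float into one byte by keeping only the top 3 mantissa bits."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> (24 - 3)
    if small <= _OFFSET:
        return 0 if bits <= 0 else 1  # zero/negative, or underflow
    if small >= _OFFSET + 0x100:
        return 255  # overflow: clamp to the largest code
    return small - _OFFSET


def decode_norm(b):
    """Expand the one-byte code back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def doc_boost(score, score_power):
    """Hypothetical document boost: link score raised to indexer.score.power."""
    return score ** score_power


def stored_field_norm(num_tokens, boost):
    """lengthNorm * boost, as it would look after the one-byte round trip."""
    length_norm = 1.0 / math.sqrt(num_tokens)
    return decode_norm(encode_norm(length_norm * boost))


# A 4-token title gives lengthNorm = 0.5; with a boost of 10.6 the exact
# product is 5.3, but only 4.0, 4.5, 5.0, 5.5, ... survive the encoding.
print(stored_field_norm(4, 10.6))  # 5.0

# Lowering the power shrinks the boost, and with it the fieldNorm.
print(doc_boost(16.0, 0.5), doc_boost(16.0, 0.25))
```

So a lengthNorm below 1 can still end up as a whole number once it is multiplied by a large boost and squeezed into a byte.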
As for your problem in #2, look at the explain page to see how your search result got there. Maybe there's a high score for an anchor match. The anchor text doesn't show up in the text of the page, so maybe that's it.

Andy

On 8/3/05, Howie Wang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've been noticing some strange search results recently. I seem
> to be getting two issues.
>
> 1. The fieldNorm for certain terms is unusually high for certain sites
> for anchors and titles. And they are usually just whole numbers (4.0, 5.0,
> etc).
> I find this strange since the lengthNorm used to calculate this is
> very unlikely to result in an integer. It's either 1/sqrt(numTokens) or
> 1/log(e+numTokens). Where is 5.0 coming from?
>
> 2. I'm getting hits for sites that don't contain ANY of the terms in my
> search. This is exacerbated by issue #1 since the fieldNorm boosts this
> page to the top of the results. I thought it might be because of my
> changes for stemming, but this happens for search terms that are not
> changed by stemming at all.
>
> Anyone run into something like this? Any ideas on how to start debugging?
>
> Thanks,
> Howie

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
