I'm using Nutch to index an intranet that so far consists of about 5 sites.

3 of the 5 sites link to each other quite heavily. In quality-testing searches, this seems to be skewing results toward those 3 domains, even where the textual match is not that strong. As a result, the result set is far from ideal.

Is there any way to tweak the ranking algorithm? When I click on "explain", I note that each domain has a starting reference point weighting. Ideally, I would like to be able to adjust that, so that I can turn down the three "noisy" domains slightly.

Is this possible and/or documented anywhere?

Thanks,

Dean
