I'm using Nutch to index an intranet that so far consists of about 5 sites.
3 of the 5 sites link to each other quite heavily. In my test searches,
this seems to skew results toward those 3 domains, even where the textual
match is not that good. As a result, the result set is far from ideal.
Is there any way to tweak the scoring algorithm? I note when I click on
"explain" that each domain has a starting reference point weighting.
Ideally I would like to be able to adjust that, so that I can turn down
the three "noisy" domains slightly.
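
To illustrate the kind of adjustment I mean, here is a rough sketch. It is
not Nutch code: the host names, the weights, and the functions
adjusted_score and rerank are all made up for the example. It just scales
each hit's score by a per-domain multiplier and re-sorts.

```python
from urllib.parse import urlparse

# Hypothetical per-domain multipliers (not a Nutch setting).
# Anything not listed keeps its score unchanged (multiplier 1.0).
DOMAIN_WEIGHTS = {
    "site-a.intranet.example": 0.8,
    "site-b.intranet.example": 0.8,
    "site-c.intranet.example": 0.8,
}


def adjusted_score(url, score, weights=DOMAIN_WEIGHTS):
    """Scale a hit's score by the multiplier for its host."""
    host = urlparse(url).hostname or ""
    return score * weights.get(host, 1.0)


def rerank(hits, weights=DOMAIN_WEIGHTS):
    """Sort (url, score) pairs by adjusted score, highest first."""
    return sorted(
        hits,
        key=lambda hit: adjusted_score(hit[0], hit[1], weights),
        reverse=True,
    )


if __name__ == "__main__":
    hits = [
        ("http://site-a.intranet.example/page", 1.0),
        ("http://site-d.intranet.example/doc", 0.9),
    ]
    for url, score in rerank(hits):
        print(url, adjusted_score(url, score))
```

With these made-up numbers, the site-d hit would move ahead of the site-a
hit, which is the sort of effect I am after.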
Is this possible and/or documented anywhere?
Thanks,
Dean