I'm sorry, please disregard my last message. I found the crawl-tool.xml
file, which I think will help.
Given that this is an intranet and all pages are "trusted" (i.e. no page
is an authority over any other, and there is no spam in the index at all),
I wonder if I can simply remove the inbound link score factor entirely and
rely on basic on-page factors only?
In other words, I would turn this off (it is currently set to true):
<property>
  <name>indexer.boost.by.link.count</name>
  <value>true</value>
  <description>When true scores for a page are multiplied by the log of
  the number of incoming links to the page.</description>
</property>
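To be concrete, the change I have in mind is something like the sketch
below. I am assuming the override goes in a site-level config file (I
believe conf/nutch-site.xml, but I have not confirmed that for my setup)
rather than editing the defaults directly; only the value differs from the
property above.

```xml
<!-- Sketch only: assumes this is placed in the site-level override file
     (e.g. conf/nutch-site.xml) inside that file's existing root element. -->
<property>
  <name>indexer.boost.by.link.count</name>
  <!-- false = do not multiply page scores by the log of the inbound
       link count -->
  <value>false</value>
</property>
```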
Any thoughts on doing this?
Also, is it possible to reindex without doing a re-crawl, so that I can do
some testing?
I'm using the basic process for an intranet crawl, and I'm very new to
Nutch!
Thanks,
Dean