Hi,
What does the db.ignore.internal.links property in nutch-default.xml do?
<property>
<name>db.ignore.internal.links</name>
<value>true</value>
<description>If true, when adding new links to a page, links from
the same host are ignored. This is an effective way to limit the
size of the link database, keeping only the highest quality
links.
</description>
</property>
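As far as I understand, this default could be overridden in conf/nutch-site.xml. A minimal sketch, assuming the standard Nutch configuration layout in which nutch-site.xml takes precedence over nutch-default.xml (the value false is chosen only for illustration):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical override of the default shown above; false is illustrative only -->
  <property>
    <name>db.ignore.internal.links</name>
    <value>false</value>
  </property>
</configuration>
```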
1. Does it affect the page rank, by taking more pages into account when the
page rank is computed, or
2. Does it affect indexing, by indexing more pages and therefore returning more results when
searching later on?
Can anybody please explain it?
Regards,
Vineet Garg