Ken Krugler wrote:
It seems that Nutch's default behavior when sorting links to fetch is
to use scoreByLinkCount. This sets the fetch score for every link on a
page to the containing page's "in-bound link" score (or, more
precisely, the log of that score).
Please also see:
http://issues.apache.org/jira/browse/NUTCH-61
That issue describes an extensible mechanism for altering the fetch schedule.
Similarly, we need an extensible mechanism for computing page scores,
which are used to prioritize the fetching of scheduled pages. Note that
the scoring mechanism has changed substantially in the development trunk
from what is in the 0.7 release.
Doug
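
To make the behavior discussed above concrete, here is a minimal sketch of
the idea: every outlink inherits a fetch score derived from the log of the
containing page's in-bound link count, and the scorer is pluggable. All
names here (LinkScoreSketch, PageScorer, Outlink, scoreOutlinks,
sortForFetch) are hypothetical and are not Nutch's actual API. The use of
the natural log and the clamping of counts below 1 are also assumptions
made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from the Nutch code base.
final class LinkScoreSketch {

    /** Pluggable page-scoring strategy (hypothetical extension point). */
    interface PageScorer {
        float score(int inboundLinkCount);
    }

    /**
     * Assumed form of the default scorer: natural log of the in-bound link
     * count. Counts below 1 are clamped to 1 so the score is never negative.
     */
    static final PageScorer LOG_LINK_COUNT =
        inbound -> (float) Math.log(Math.max(1, inbound));

    /** An outlink waiting to be fetched, with the score it inherited. */
    static final class Outlink {
        final String url;
        final float fetchScore;

        Outlink(String url, float fetchScore) {
            this.url = url;
            this.fetchScore = fetchScore;
        }
    }

    /** Gives every outlink on a page the score of the containing page. */
    static List<Outlink> scoreOutlinks(int pageInboundLinks,
                                       List<String> outlinkUrls,
                                       PageScorer scorer) {
        float pageScore = scorer.score(pageInboundLinks);
        List<Outlink> scored = new ArrayList<>();
        for (String url : outlinkUrls) {
            scored.add(new Outlink(url, pageScore));
        }
        return scored;
    }

    /** Orders links so the highest fetch score comes first. */
    static void sortForFetch(List<Outlink> links) {
        links.sort(
            Comparator.comparingDouble((Outlink l) -> l.fetchScore).reversed());
    }
}
```

With this sketch, links found on a page with 100 in-bound links would sort
ahead of links found on a page with a single in-bound link. Swapping in a
different PageScorer changes the prioritization without touching the
sorting code, which is the kind of extensibility asked for above.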
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers