3. Generate a fetch list & fetch it. When the url has been previously
fetched, and its content is unchanged, increase its fetch interval by an
amount, e.g., 50%. If the content is changed, decrease the fetch
interval. The percentage of increase and decrease might be influenced
by the url's score.
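
A minimal sketch of the interval adjustment described in step 3, in
Python for brevity. The function name, the 50% rates, and the
minimum/maximum bounds are assumptions chosen for illustration; they are
not taken from Nutch. A caller could derive the two rates from the URL's
score, which this sketch leaves out.

```python
DAY = 86400.0  # seconds


def next_fetch_interval(interval, changed, inc_rate=0.5, dec_rate=0.5,
                        min_interval=3600.0, max_interval=90 * DAY):
    """Return the next fetch interval in seconds (hypothetical helper).

    interval -- current fetch interval in seconds
    changed  -- True if the content differs from the previous fetch
    inc_rate -- fractional increase applied when content is unchanged
    dec_rate -- fractional decrease applied when content has changed
    """
    if changed:
        # Content changed: come back sooner.
        interval = interval * (1.0 - dec_rate)
    else:
        # Content unchanged: wait longer before the next fetch.
        interval = interval * (1.0 + inc_rate)
    # Clamp so a page is never fetched too often and never forgotten.
    return max(min_interval, min(max_interval, interval))
```

With these assumed defaults, an unchanged page on a one-day interval
moves to 1.5 days, and a changed one moves to half a day.
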
Hi,
if we tracked how often pages change in this way, we could also prefer
frequently changing pages in the ranking algorithm.
Frequently changing pages might be more up to date and could have a
higher value than pages that never change.
Also, pages that have been unchanged for a long time might be out of
date and could lose a little of their general score.
So, maybe the fetch interval value could be used as a multiplier for
boosting pages in the final result set.
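
One way this could look, as a rough sketch in Python: the function name,
the 30-day reference interval, the square-root damping, and the boost
bounds are all assumptions made up for illustration, not an existing
Nutch feature. A short fetch interval (frequent changes) gives a boost
above 1, a long one gives a boost below 1.

```python
DAY = 86400.0  # seconds


def change_boost(interval, reference=30 * DAY, exponent=0.5,
                 min_boost=0.5, max_boost=2.0):
    """Map a fetch interval (seconds) to a score multiplier (hypothetical).

    A page whose interval equals `reference` gets a boost of 1.0.
    `exponent` < 1 damps the effect so the interval does not dominate
    the other ranking factors.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    boost = (reference / interval) ** exponent
    # Bound the multiplier so extreme intervals cannot swamp the score.
    return max(min_boost, min(max_boost, boost))


def boosted_score(base_score, interval):
    # Apply the multiplier to whatever score the ranking already produced.
    return base_score * change_boost(interval)
```

The clamping is there because an unbounded multiplier would let a page
that changes every hour outrank far more relevant pages.
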
Matthias
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers