Hello,

I would like to know what algorithm Nutch uses for fetching and how it
creates the Fetchlist.

For example, if a Fetchlist contains 1000 URLs from a single host, that host
will be accessed continuously during the crawl, which might cause trouble
for it. I would like to know how to avoid this kind of problem and, if
possible, how to build Fetchlists whose URLs all come from different
hosts.
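
To make the idea concrete, here is a rough sketch in Python of the kind of
splitting I have in mind. This is not Nutch code and I do not know whether
Nutch works this way; the function name split_into_fetchlists and the
max_per_host parameter are my own assumptions, used only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def split_into_fetchlists(urls, max_per_host=1):
    """Spread URLs over several lists so that no single list holds more
    than max_per_host URLs of the same host.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    if max_per_host < 1:
        raise ValueError("max_per_host must be at least 1")

    fetchlists = []
    placed_per_host = defaultdict(int)  # URLs of each host placed so far

    for url in urls:
        host = urlsplit(url).hostname or ""
        # The n-th URL of a host (counting from 0) goes into list
        # number n // max_per_host.
        index = placed_per_host[host] // max_per_host
        placed_per_host[host] += 1
        while len(fetchlists) <= index:
            fetchlists.append([])
        fetchlists[index].append(url)

    return fetchlists
```

With max_per_host=1, each resulting list contains a given host at most once,
which is the "entirely different hosts" case described above.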

Thank you.




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
