Hello, I would like to know what algorithm Nutch uses for fetching and how it creates the fetchlist.
For example, if a fetchlist contains 1000 URLs from a single host, that host will be accessed continuously during the crawl, which might cause trouble for it. I would like to know how to avoid this problem and, if possible, how to generate fetchlists whose URLs come from entirely different hosts. Thank you.
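
To make the question concrete, here is a rough sketch of the behavior I have in mind. It is not Nutch code and does not use any Nutch API; `build_fetchlist` and `max_per_host` are hypothetical names chosen only for illustration. It caps the number of URLs taken from each host and interleaves hosts so that consecutive entries do not hit the same server.

```python
from collections import defaultdict
from itertools import zip_longest
from urllib.parse import urlsplit


def build_fetchlist(urls, max_per_host):
    """Return a list of URLs with at most max_per_host entries per host,
    ordered round-robin across hosts (hypothetical helper, not Nutch code)."""
    by_host = defaultdict(list)
    for url in urls:
        # Group by hostname; URLs without a hostname share the "" bucket.
        host = urlsplit(url).hostname or ""
        if len(by_host[host]) < max_per_host:
            by_host[host].append(url)

    # Take the first URL of every host, then the second of every host, etc.
    fetchlist = []
    for group in zip_longest(*by_host.values()):
        fetchlist.extend(u for u in group if u is not None)
    return fetchlist
```

With `max_per_host=1`, every entry in the resulting list would come from a different host, which is the extreme case I am asking about.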
