On Mon, 2006-01-16 at 18:02 -0500, Insurance Squared Inc. wrote:
> My ISP called and said my nutch crawler is chewing up 20mbits on a line
> he's only supposed to be using 10. Is there an easy way to tinker with
> how much bandwidth we're using at once? I know we can change the number
> of open threads the crawler has, but it seems to me this won't make a
> huge difference. If I chop the number of open threads in half, it'll
> just download half the pages, twice as fast? I stand to be corrected on
> this.

Raise the delay between page fetches and cut the number of threads
tenfold. Then increase the thread count step by step until you hit your
target bandwidth. I've found I can get within 5% of my target this way.

-- 
Rod Taylor <[EMAIL PROTECTED]>

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
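
The tuning procedure in the reply (cut threads tenfold, then step back up
until the target is reached) can be sketched as a small loop. This is only
an illustration, not anything shipped with Nutch: tune_thread_count and
measure_mbps are hypothetical names, and measure_mbps stands in for
whatever you actually do to observe bandwidth at a given thread count (for
example, running a fetch and reading the interface counters). In Nutch the
two knobs are usually the fetcher.threads.fetch and fetcher.server.delay
properties, but check the nutch-default.xml of your version.

```python
from typing import Callable


def tune_thread_count(
    measure_mbps: Callable[[int], float],
    target_mbps: float,
    current_threads: int,
    max_threads: int = 1000,
) -> int:
    """Return the largest tested thread count that stays at or below target.

    measure_mbps is a hypothetical callback: given a thread count, it
    returns the bandwidth (Mbit/s) observed while crawling with it.
    """
    # Start at one tenth of the current thread count (never below 1).
    threads = max(1, current_threads // 10)
    best = threads
    while threads <= max_threads:
        if measure_mbps(threads) > target_mbps:
            # Over the target: stop and keep the last acceptable count.
            break
        best = threads
        threads += 1
    # If even the starting count exceeds the target, the starting count is
    # returned unchanged; in that case raise the fetch delay further.
    return best


if __name__ == "__main__":
    # Assumed toy model for demonstration: 0.25 Mbit/s per thread.
    print(tune_thread_count(lambda t: 0.25 * t, target_mbps=10.0,
                            current_threads=100))
```

Stepping one thread at a time is slow but matches the "increase until you
hit your target" advice; with a real crawl you would likely use larger
steps and let each measurement run long enough to average out bursts.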
