For ISPs around the world, the most important metric is the number of active
TCP sessions.

Manufacturers such as Cisco sell/license their hardware with different
session limits: 1024 sessions, 65536 sessions, etc.

Backbones are shared between users, and a single user holding 1024 sessions
can starve everyone else.

ISPs don't like "download accelerators" such as wget that open a few TCP
sessions for a single file download.


-----Original Message-----
From: Insurance Squared Inc. [mailto:[EMAIL PROTECTED] 
Sent: Monday, January 16, 2006 6:03 PM
To: [email protected]
Subject: throttling bandwidth


My ISP called and said my nutch crawler is chewing up 20 Mbit/s on a line 
that's only supposed to be using 10.  Is there an easy way to tinker with 
how much bandwidth we're using at once?  I know we can change the number 
of open threads the crawler has, but it seems to me this won't make a 
huge difference.  If I chop the number of open threads in half, it'll 
just download half the pages twice as fast?  I stand to be corrected on 
this.
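
For what it's worth, Nutch's fetcher has a few knobs in nutch-site.xml
beyond the raw thread count; the property names below are from
nutch-default.xml, but check your version, since names and defaults may
differ. Raising fetcher.server.delay in particular slows the crawl down
per host rather than just reducing parallelism:

```xml
<!-- Sketch of a throttled fetcher config (verify property names against
     the nutch-default.xml shipped with your Nutch version). -->
<property>
  <name>fetcher.threads.fetch</name>
  <value>10</value>
  <description>Total number of fetcher threads.</description>
</property>
<property>
  <name>fetcher.threads.per.host</name>
  <value>1</value>
  <description>Max concurrent requests to any one host.</description>
</property>
<property>
  <name>fetcher.server.delay</name>
  <value>5.0</value>
  <description>Seconds to wait between requests to the same server;
  increasing this lowers sustained bandwidth per host.</description>
</property>
```

Note the original poster's intuition is partly right: fewer threads alone
may just finish each page faster, but a per-host delay caps throughput
regardless of thread count.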

Any other thoughts? It doesn't have to be correct or elegant, as long as it 
works.

Failing a reasonable solution in Nutch, is there some sort of Linux-level 
tool that will easily allow me to throttle how much bandwidth the crawl 
is using at once?
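
On the Linux side, one common approach is the `tc` traffic-control tool
from iproute2. A sketch, assuming the crawler box's interface is eth0 and
you want to cap around 8 Mbit/s (interface name and rates are
placeholders; run as root):

```
# Shape outbound traffic with a token bucket filter (tbf).
# Note: tbf only shapes egress; for a crawler the heavy traffic is
# inbound, so also police ingress (second stanza).
tc qdisc add dev eth0 root tbf rate 8mbit burst 32kb latency 400ms

# Police inbound traffic: drop packets above the rate so TCP backs off.
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 50 \
   u32 match ip src 0.0.0.0/0 \
   police rate 8mbit burst 32k drop flowid :1

# To undo:
#   tc qdisc del dev eth0 root
#   tc qdisc del dev eth0 ingress
```

Dropping inbound packets to throttle downloads is crude but works,
because the remote TCP stacks slow down in response to the loss.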

Thanks.
