Just to add my 2 cents: for the most part, if you have
a decent NIC you can issue OS commands to drop the
port rate of your interface to 10 Mbit and not waste
CPU cycles on shaping/proxying.
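
For anyone who wants to try this, here is a rough sketch of what I mean,
assuming a Linux box with ethtool installed and an interface called eth0
(both are my assumptions, adjust to taste). The helper name
link_speed_command is made up for illustration; it only builds the
command line, it does not run it.

```python
# Sketch only: builds an ethtool command that pins a NIC to a fixed speed.
# Assumptions (not from the original thread): Linux, ethtool available,
# interface name "eth0". Running the resulting command needs root.

def link_speed_command(interface, mbit=10, duplex="full"):
    """Return the argv list for forcing `interface` to `mbit` Mbit/s."""
    if mbit not in (10, 100, 1000):
        raise ValueError("speed must be 10, 100 or 1000 Mbit")
    if duplex not in ("half", "full"):
        raise ValueError("duplex must be 'half' or 'full'")
    # Autonegotiation is switched off so the forced speed actually sticks.
    return ["ethtool", "-s", interface,
            "speed", str(mbit), "duplex", duplex, "autoneg", "off"]


if __name__ == "__main__":
    # Print the command rather than executing it, so nothing changes
    # until you paste it into a root shell yourself.
    print(" ".join(link_speed_command("eth0")))
```

Two caveats: the switch port on the other end should be set to match,
otherwise you risk a duplex mismatch, and this caps everything on that
interface, not just the crawler.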

That said, I do recommend Squid for this, since I also
use it to filter and offload regex/hostname blocks.

-byron

--- Jay Pound <[EMAIL PROTECTED]> wrote:

> There are a number of Linux packages for QoS/traffic shaping; my favorite is
> wondershaper. I haven't set it up since the 2.4 kernel, but it works well.
> Also, if you're not inclined to do something that involved, your ISP can give
> that machine's IP address a CAR (committed access rate) statement in
> your/their Cisco router, preventing that particular machine from using more
> than x bandwidth. Or the do-it-yourself solution: buy the cheapest POS router
> (Linksys, generic). They won't be able to route 20 Mbit of data through the
> NAT; at least older ones couldn't get much more than 5 Mbit or so (newer ones
> can do 9 Mbit+). So there are some solutions to your problem.
> -J
> ----- Original Message ----- 
> From: "Insurance Squared Inc." <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Monday, January 16, 2006 6:02 PM
> Subject: throttling bandwidth
> 
> 
> > My ISP called and said my nutch crawler is chewing up 20mbits on a line
> > he's only supposed to be using 10.   Is there an easy way to tinker with
> > how much bandwidth we're using at once?  I know we can change the number
> > of open threads the crawler has, but it seems to me this won't make a
> > huge difference.  If I chop the number of open threads in half, it'll
> > just download half the pages, twice as fast?  I stand to be corrected on
> > this.
> >
> > Any other thoughts? doesn't have to be correct or elegant as long as it
> > works.
> >
> > Failing a reasonable solution in nutch, is there some sort of linux
> > level tool that will easily allow me to throttle how much bandwidth the
> > crawl is using at once?
> >
> > Thanks.



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
