Set up your own DNS server on a machine separate from your crawling box. It doesn't have to be powerful (a 500 MHz Pentium III will do), but it needs at least 512 MB of RAM, with 1 GB recommended. Point your fetcher machine at that server as its primary DNS server and, presto, you have internal DNS caching!

-J

PS: If you're a Windows person, the easiest DNS server to set up is the one in Windows 2000 Server or Windows Server 2003: you just enable it and it runs. There are many DNS servers for Linux, and most distributions ship one on the CD. Mac OS X Server includes one as well.

----- Original Message -----
From: "Stefan Groschupf" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, August 03, 2005 11:05 AM
Subject: Re: dns lookup cache?
> How do you do 'internal' domain caching?
> Thanks.
> Stefan
>
> On 03.08.2005 at 16:51, Jay Pound wrote:
>
> > I've got a fast internal DNS cache, so Nutch won't need one, and it
> > did stop a lot of Nutch's host-not-found/timeout errors. Most ISPs'
> > DNS servers are already bogged down by client requests; if you dump
> > 10,000 clients' worth of DNS traffic on them, they can break or fail
> > to return results. So I made my own internal DNS server cache. The
> > machine, a quad Xeon with 4 GB of RAM, uses over 500 MB of RAM just
> > for caching the domains in memory!
> > -Jay
> >
> > ----- Original Message -----
> > From: "Stefan Groschupf" <[EMAIL PROTECTED]>
> > To: <[email protected]>
> > Sent: Wednesday, August 03, 2005 4:19 AM
> > Subject: dns lookup cache?
> >
> >> Hi there,
> >> does Nutch cache DNS lookups in any way?
> >> I found this paper and section 3.7 gives some very interesting
> >> information.
> >> We notice that our crawlers often crash after a series of unknown
> >> host exceptions.
> >> We already have one dual-CPU box with a 1 Gbit network connection
> >> running BIND.
> >>
> >> So I have 2 questions:
> >> Do people think that Java's domain lookup may be a bottleneck that
> >> crashes the crawlers?
> >> Other crawlers have a kind of DNS cache; would it make sense to
> >> introduce one to Nutch as well?
> >>
> >> Thanks for any comments.
> >> Stefan

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
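
For comparison with the external caching server described in this thread, here is a minimal sketch of the other option raised in the original question: a DNS cache inside the crawler process. It is not Nutch code. The class name DnsCache, the Resolver interface, and the single fixed TTL are all assumptions made for illustration; a real cache would likely honor per-record TTLs and bound its size. Failed lookups are cached too, so a burst of requests for a dead host does not hit the resolver repeatedly.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-process DNS cache with one fixed TTL; not part of Nutch. */
final class DnsCache {

    /** Lookup abstraction, so the real resolver can be swapped out. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private static final class Entry {
        final InetAddress[] addresses; // null marks a cached lookup failure
        final long expiresAtMillis;

        Entry(InetAddress[] addresses, long expiresAtMillis) {
            this.addresses = addresses;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Resolver resolver;
    private final long ttlMillis;

    DnsCache(Resolver resolver, long ttlMillis) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
    }

    /** Convenience factory that delegates to the JVM's own resolver. */
    static DnsCache withSystemResolver(long ttlMillis) {
        return new DnsCache(InetAddress::getAllByName, ttlMillis);
    }

    /** Returns cached addresses, resolving only on a miss or after expiry. */
    InetAddress[] lookup(String host) throws UnknownHostException {
        String key = host.toLowerCase(Locale.ROOT); // host names are case-insensitive
        long now = System.currentTimeMillis();
        Entry entry = cache.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            InetAddress[] resolved;
            try {
                resolved = resolver.resolve(key);
            } catch (UnknownHostException e) {
                resolved = null; // remember the failure until the TTL runs out
            }
            entry = new Entry(resolved, now + ttlMillis);
            cache.put(key, entry);
        }
        if (entry.addresses == null) {
            throw new UnknownHostException(host);
        }
        return entry.addresses;
    }
}
```

A fetcher could create one shared instance, for example DnsCache.withSystemResolver(3600000L), and call lookup(host) before opening each connection. The JVM also keeps its own lookup cache, controlled by the networkaddress.cache.ttl security property, so this sketch mostly matters when finer control is wanted. It does not replace a local caching name server, which also serves every other process on the network.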
