Thanks, Markus. Another question: the script will stop, right? I mean, I am not going to crawl for 100 days; I need it to finish its job. Dennis
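[An aside on the "will it stop?" question: with the whole-web commands, the crawl only runs for as many rounds as the script performs. Below is a minimal sketch of such a bounded loop, assuming Nutch 1.x command names (inject, generate, fetch, updatedb) and the usual crawldb/segments layout. Here `nutch` is a stub that just echoes the call so the loop shape itself can be run anywhere; on a real install you would call `bin/nutch` instead, and pick up the segment directory that `generate` actually created.]

```shell
#!/bin/sh
# Stub standing in for bin/nutch: records each invocation.
# Replace with the real bin/nutch on a Nutch installation.
nutch() { echo "nutch $*"; }

ROUNDS=3   # hard cap: the loop terminates after ROUNDS fetch cycles

# Seed the crawldb once with the start URLs.
nutch inject crawl/crawldb urls

i=1
while [ "$i" -le "$ROUNDS" ]; do
  # Pick the top-scoring URLs for this round (placeholder segment name;
  # in real Nutch, generate creates a timestamped segment directory).
  nutch generate crawl/crawldb crawl/segments -topN 1000
  nutch fetch "crawl/segments/round$i"
  # Fold the fetched links and statuses back into the crawldb, then recrawl.
  nutch updatedb crawl/crawldb "crawl/segments/round$i"
  i=$((i + 1))
done
```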
--- On Tue, 9/28/10, Markus Jelsma <[email protected]> wrote:

From: Markus Jelsma <[email protected]>
Subject: Re: crawl www
To: "Dennis" <[email protected]>
Cc: [email protected]
Date: Tuesday, September 28, 2010, 9:16 PM

Oh, you don't need crawl-urlfilter.txt. It is used by the crawl command only, and if you're about to crawl the internet (!), you will need the steps I explained in the other e-mail. You can forget about the crawl command in this case.

On Tuesday 28 September 2010 14:58:32 Dennis wrote:
> Sorry for interrupting, Markus,
>
> But I don't quite understand. How do I "update your DB's"? What should I
> do about "crawl-urlfilter.txt"? Thanks
>
> Dennis
>
> --- On Tue, 9/28/10, Markus Jelsma <[email protected]> wrote:
>
> From: Markus Jelsma <[email protected]>
> Subject: Re: crawl www
> To: [email protected]
> Date: Tuesday, September 28, 2010, 8:19 PM
>
> Dennis, you shouldn't hijack my thread ;)
>
> Anyway, it's all about crawl, update your DB's, recrawl, and keep
> repeating the same loop over and over.
>
> Cheers,
>
> On Tuesday 28 September 2010 10:08:00 Dennis wrote:
> > Hi, all,
> >
> > I want to crawl the whole www. How do I configure "crawl-urlfilter.txt"?
> > It used to be:
> >
> >   # accept hosts in MY.DOMAIN.NAME
> >   +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
> >
> > Thanks
> > Dennis
>
> Markus Jelsma - Technisch Architect - Buyways BV
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
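[For the original filter question: for a whole-web crawl, the single-domain accept rule quoted above is typically replaced with a permissive one. A sketch of what the filter file could look like, loosely following the style of Nutch's default rules (first matching line wins; `-` rejects, `+` accepts); treat the exact patterns as an example, not the canonical file:]

```
# skip file:, ftp:, and mailto: URLs
-^(file|ftp|mailto):

# skip URLs containing characters that are probably session IDs or queries
-[?*!@=]

# accept everything else
+.
```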

