Thanks, Markus,
Another question: the script will stop, right? I mean, I am not going to crawl
for 100 days; I need it to finish its job.
Dennis

--- On Tue, 9/28/10, Markus Jelsma <[email protected]> wrote:

From: Markus Jelsma <[email protected]>
Subject: Re: crawl www
To: "Dennis" <[email protected]>
Cc: [email protected]
Date: Tuesday, September 28, 2010, 9:16 PM

Oh, you don't need crawl-urlfilter.txt. It's used by the crawl command only,
and if you're about to crawl the internet (!), you will need the steps I
explained in the other e-mail. You can forget about the crawl command
in this case.
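For reference, the steps Markus alludes to are the standard Nutch 1.x fetch cycle. A minimal sketch of a bounded loop might look like the following; the seed directory `urls/`, the `crawl/` paths, the depth of 3, and the `-topN` value are all placeholders to adapt to your setup, and a fixed round count is what makes the script stop rather than run forever:

```shell
#!/bin/sh
# Sketch of a bounded Nutch 1.x whole-web crawl loop.
# Assumes a seed list in urls/ and bin/nutch available from the Nutch install.

bin/nutch inject crawl/crawldb urls            # seed the crawl DB once

for round in 1 2 3; do                         # fixed depth, so the loop terminates
  bin/nutch generate crawl/crawldb crawl/segments -topN 1000
  segment=$(ls -d crawl/segments/* | tail -1)  # newest segment just generated
  bin/nutch fetch "$segment"                   # fetch the generated URLs
  bin/nutch parse "$segment"                   # parse the fetched content
  bin/nutch updatedb crawl/crawldb "$segment"  # the "update your DB's" step
done
```

Each pass through the loop discovers new links via updatedb, so the next generate round has fresh URLs to fetch; that is the "keep repeating the same loop" part.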


On Tuesday 28 September 2010 14:58:32 Dennis wrote:
> Sorry for interrupting, Markus,
> 
> But I don't quite understand. How do I "update your DB's"? What should I
>  do about "crawl-urlfilter.txt"? Thanks
> 
> 
> Dennis
> 
> --- On Tue, 9/28/10, Markus Jelsma <[email protected]> wrote:
> 
> From: Markus Jelsma <[email protected]>
> Subject: Re: crawl www
> To: [email protected]
> Date: Tuesday, September 28, 2010, 8:19 PM
> 
> Dennis, you shouldn't hijack my thread ;)
> 
> Anyway, it's all about crawl, update your DB's, recrawl, and keep
>  repeating the same loop over and over.
> 
> Cheers,
> 
> On Tuesday 28 September 2010 10:08:00 Dennis wrote:
> > Hi, all,
> > I want to crawl the whole www, how do I config "crawl-urlfilter.txt"?
> > It used to be:
> > # accept hosts in MY.DOMAIN.NAME
> > +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
> > Thanks
> > Dennis
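For a whole-web crawl, the usual change (following Nutch tutorial conventions; the exact file name varies by version, with step-by-step crawls reading regex-urlfilter.txt rather than crawl-urlfilter.txt) is to drop the per-domain rule and end the filter with a catch-all accept. A sketch of such a filter:

```
# skip file:, ftp:, and mailto: urls
-^(file|ftp|mailto):

# accept everything else (whole-web crawl)
+.
```

Rules are applied top to bottom, and the first matching `+`/`-` prefix decides whether a URL is kept, so the catch-all `+.` must come last.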
> 
> Markus Jelsma - Technisch Architect - Buyways BV
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
> 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
