Hi Steve,
The crawl-urlfilter.txt file is for intranet crawling, while
the regex-urlfilter.txt file is for internet (whole-web) crawling.
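
To illustrate the difference, here is a rough sketch (not Nutch's actual code) of how such "+regex / -regex" rule files behave: rules are tried in order, the first matching pattern decides, and a URL matching nothing is rejected. The rule sets, the domain example.com, and the function names are assumptions made up for this example; check your own config files for the real rules.

```python
import re

# Hypothetical rule sets in the "+regex / -regex" style of the filter files.
# example.com is a placeholder domain, not something from this thread.
INTRANET_RULES = r"""
# skip non-http schemes
-^(file|ftp|mailto):
# accept only hosts inside one domain
+^http://([a-z0-9]*\.)*example\.com/
# reject everything else
-.
"""

WHOLE_WEB_RULES = r"""
# skip non-http schemes
-^(file|ftp|mailto):
# accept everything else
+.
"""

def parse_rules(text):
    """Turn rule text into a list of (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        # Ignore blanks, comments, and lines without a +/- prefix.
        if not line or line[0] not in "+-":
            continue
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules

def accepts(rules, url):
    """First matching rule wins; a URL matching no rule is rejected."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False

intranet = parse_rules(INTRANET_RULES)
whole_web = parse_rules(WHOLE_WEB_RULES)
print(accepts(intranet, "http://www.example.com/index.html"))
print(accepts(intranet, "http://other.org/"))
print(accepts(whole_web, "http://other.org/"))
```

With rules like these, the intranet-style set keeps the crawl inside one domain, while the whole-web set lets the crawl follow links anywhere.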

Kind regards,
Olaf



On Thu, 31 Mar 2005 12:01:19 +0800, Steve Follmer <[EMAIL PROTECTED]> wrote:
> 
> What's the difference between crawl-urlfilter.txt and
> regex-urlfilter.txt?
> They look very similar. Why does Nutch have both, and what do they do
> differently?
> 
> My best guess is that the first is used only by the crawl tool and the
> second is used by Nutch proper. The crawl tool and Nutch proper also
> seem to have separate .xml config files. I further guess that this is
> just an artifact of having two separate tools that need separate but
> equal configuration?
> 
> -Poindexter
> 
> 


-- 

<SimpleHuman gender="male">
   <Physical name="Olaf Thiele" />
   <Virtual adress="http://www.olafthiele.de" />
</SimpleHuman>
