Hi Steve,

crawl-urlfilter.txt is for intranet crawling (it is the filter file the crawl tool reads), while regex-urlfilter.txt is for internet (whole-web) crawling with the individual Nutch commands.
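To make the difference concrete, here is a rough sketch of how such a filter file is evaluated. It is not Nutch's actual code. It assumes the usual format of one regular expression per line, prefixed with `+` (include) or `-` (exclude), where the first matching line decides and a URL matching nothing is dropped. The names `load_rules` and `accepts` and the two rule sets (using `example.com`) are made up for illustration and are not the contents of the shipped files.

```python
import re

def load_rules(text):
    """Parse filter lines: '+regex' includes, '-regex' excludes, '#' is a comment."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Return the verdict of the first rule whose pattern is found in the URL."""
    for include, pattern in rules:
        if pattern.search(url):
            return include
    return False  # assumption: a URL that matches no rule is ignored

# Hypothetical intranet-style rules: only one domain, everything else skipped.
INTRANET_RULES = r"""
# skip file:, ftp: and mailto: URLs
-^(file|ftp|mailto):
# accept hosts in example.com
+^http://([a-z0-9]*\.)*example\.com/
# skip everything else
-.
"""

# Hypothetical whole-web-style rules: accept anything not explicitly skipped.
WHOLE_WEB_RULES = r"""
-^(file|ftp|mailto):
+.
"""

if __name__ == "__main__":
    intranet = load_rules(INTRANET_RULES)
    web = load_rules(WHOLE_WEB_RULES)
    for url in ("http://www.example.com/index.html", "http://other.org/"):
        print(url, accepts(intranet, url), accepts(web, url))
```

So the two files share one syntax and differ mainly in how restrictive the final catch-all rule is.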
Kind regards,
Olaf

On Thu, 31 Mar 2005 12:01:19 +0800, Steve Follmer <[EMAIL PROTECTED]> wrote:
>
> What's the difference between crawl-urlfilter.txt and regex-urlfilter.txt?
> They look very similar. Why does nutch have both, and what do they do
> different?
>
> My best guess is that the first is used only by the crawl tool and the second
> is used by nutch proper. The crawl tool and nutch proper seem to also have
> separate .xml config files. I further guess that this is just an artifact of
> having two separate tools that need separate but equal configuration?
>
> -Poindexter
>

--
<SimpleHuman gender="male">
<Physical name="Olaf Thiele" />
<Virtual adress="http://www.olafthiele.de" />
</SimpleHuman>
