What's the difference between crawl-urlfilter.txt and regex-urlfilter.txt? They look very similar. Why does Nutch have both, and what do they do differently?
My best guess is that the first is used only by the crawl tool and the second is used by Nutch proper. The crawl tool and Nutch proper also seem to have separate .xml config files. I further guess that this is just an artifact of having two separate tools that need separate but equal configuration. -Poindexter
