If you are sure that the plugin was deployed successfully (check the logs: at the very beginning there is a section listing the included plugins), then there is nothing more for you to do. What happens behind the scenes is that during segment generation the plugin is asked whether a specific URL can be added to the segment's fetch list. This happens only if the URL passes ALL (!!) URL filters that are deployed (see the log file), so verify that you are not blocking a URL with a regular expression in a deployed regex URL filter.
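To illustrate the "must pass ALL filters" point, here is a minimal sketch of the chaining logic. It is written in Python for brevity and is not Nutch's actual API. The names make_regex_filter, make_whitelist_filter and passes_all are hypothetical. The whitelist stands in for a database-backed filter, and the regex rules assume a "first matching rule decides, no match rejects" convention.

```python
import re
from urllib.parse import urlparse


def make_regex_filter(rules):
    """Build a filter from (sign, pattern) pairs.

    Assumed convention: the first matching rule decides ('+' accepts,
    '-' rejects); a URL that matches no rule is rejected.
    """
    compiled = [(sign, re.compile(pattern)) for sign, pattern in rules]

    def apply(url):
        for sign, pattern in compiled:
            if pattern.search(url):
                return url if sign == "+" else None
        return None

    return apply


def make_whitelist_filter(allowed_hosts):
    """Hypothetical stand-in for a database-backed host whitelist."""
    hosts = set(allowed_hosts)

    def apply(url):
        return url if urlparse(url).hostname in hosts else None

    return apply


def passes_all(url, filters):
    """A URL reaches the fetch list only if every filter accepts it."""
    current = url
    for url_filter in filters:
        current = url_filter(current)
        if current is None:
            return False  # one rejecting filter is enough to drop the URL
    return True


filters = [
    # Reject URLs containing query-like characters, accept everything else.
    make_regex_filter([("-", r"[?*!@=]"), ("+", r".")]),
    make_whitelist_filter(["example.org"]),
]
```

With these two filters, http://example.org/index.html passes, while http://example.org/page?id=1 is dropped by the regex filter even though its host is whitelisted. That is the situation to look for if whitelisted URLs never show up in a segment.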

HTH
Stefan


On 01.12.2005 at 06:44, Brent Parker wrote:

Greetings,

I'm a Nutch (0.7.1) newbie. I have installed it - used the Intranet crawl, and all works fine. I want to crawl the web, using a relatively small list of domains. Therefore, I am interested in using the urlfilter-db plugin (http://issues.apache.org/jira/browse/NUTCH-100). I have downloaded the plugin. I was able to build and deploy with no problem. I set up the nutch-default.xml, nutch-site.xml, and mysql as specified in the plugin instructions. But how do I use (invoke) the plugin?

I am using the tutorial (http://lucene.apache.org/nutch/tutorial.html) as my guide to do whole-web crawling. Do I now start from the "Whole-web: Fetching" section?

Just need a "little" guidance (I think).

Thanks in advance!
Brent


