Greetings, I'm a Nutch (0.7.1) newbie. I have installed it and run the Intranet crawl, and all works fine. Now I want to crawl the web, restricted to a relatively small list of domains, so I am interested in using the urlfilter-db plugin (http://issues.apache.org/jira/browse/NUTCH-100). I downloaded the plugin and was able to build and deploy it with no problem. I set up nutch-default.xml, nutch-site.xml, and MySQL as specified in the plugin instructions. But how do I use (invoke) the plugin?
I am using the tutorial (http://lucene.apache.org/nutch/tutorial.html) as my guide to do whole-web crawling. Do I now start from the "Whole-web: Fetching" section? Just need a "little" guidance (I think). Thanks in advance! Brent
_______________________________________________
Nutch-general mailing list https://lists.sourceforge.net/lists/listinfo/nutch-general
