[Nutch-general] urlfilter-db usage

Brent Parker Wed, 30 Nov 2005 21:16:03 -0800

Greetings,

I'm a Nutch (0.7.1) newbie. I installed it and ran the intranet crawl, and
everything works fine. I now want to crawl the web using a relatively small
list of domains, so I am interested in the urlfilter-db plugin
(http://issues.apache.org/jira/browse/NUTCH-100). I downloaded the plugin
and was able to build and deploy it with no problem. I set up
nutch-default.xml, nutch-site.xml, and MySQL as specified in the plugin
instructions. But how do I use (invoke) the plugin?


I am using the tutorial (http://lucene.apache.org/nutch/tutorial.html) as my
guide to do whole-web crawling.  Do I now start from the "Whole-web:
Fetching" section?

Just need a "little" guidance (I think).

Thanks in advance!
Brent



