Hello,

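In regex-urlfilter.txt (or crawl-urlfilter.txt if you use the crawl command), put the accept rule before the catch-all reject rule, because the first matching rule decides:
+^http://intranet/development/pdffiles/
-.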
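
The regex URL filter checks its rules from top to bottom and takes the first match as the verdict, so the order of the lines matters. A minimal Python sketch of that behavior, not Nutch's own code; the names RULES and accept are made up for illustration, and the URL assumes the usual double slash after "http:":

```python
import re

# Hypothetical rule list mirroring the filter file: (sign, pattern) pairs,
# checked top to bottom. This imitates first-match-wins filtering only;
# it is not how Nutch itself is implemented.
RULES = [
    ("+", r"^http://intranet/development/pdffiles/"),  # accept the one directory
    ("-", r"."),                                       # reject everything else
]


def accept(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is '+'."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched, so the URL is dropped


print(accept("http://intranet/development/pdffiles/manual.pdf"))  # True
print(accept("http://intranet/hr/policies.html"))                 # False
```

With the two rules swapped, "-." would match every URL first and nothing would be accepted.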

Make sure the urlfilter-regex plugin is included in nutch-site.xml or in nutch-default.xml.
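
One way to check this is to look at the plugin.includes property in the config file. A rough Python sketch, assuming the property value is a regular expression matched against whole plugin ids; the sample XML, its root element name and the function name plugin_enabled are assumptions for illustration, not taken from a real install:

```python
import re
import xml.etree.ElementTree as ET


def plugin_enabled(conf_xml, plugin_id="urlfilter-regex"):
    """Return True if the plugin.includes value in conf_xml matches plugin_id."""
    root = ET.fromstring(conf_xml)
    # Look at every <property> element, whatever the root element is called.
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "plugin.includes":
            value = prop.findtext("value", "").strip()
            # Assumption: the value is a regex that must match the whole id.
            return re.fullmatch(value, plugin_id) is not None
    return False  # property not set in this file


# Made-up sample config, only for demonstration.
SAMPLE = """
<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic</value>
  </property>
</configuration>
"""

print(plugin_enabled(SAMPLE))  # True
```

If the property is missing from nutch-site.xml, the value from nutch-default.xml applies, so check that file as well.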

Regards,
Ferenc

Clint Cagle wrote:

How do I set up Nutch to search only one directory on an intranet?

For example,
http://intranet/development/pdffiles/





_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
