Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/WritingPlugins

------------------------------------------------------------------------------
   * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/protocol/Protocol.html 
Protocol] -- Protocol implementations allow Nutch to use different protocols 
(ftp, http, etc.) to fetch documents.
   * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/URLFilter.html 
URLFilter] -- URLFilter implementations limit the URLs that Nutch attempts to 
fetch.  The 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/RegexURLFilter.html
 RegexURLFilter] distributed with Nutch provides a great deal of control over 
which URLs Nutch crawls; however, if you have very complicated rules about which 
URLs you want to crawl, you can write your own implementation.
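
As a rough illustration of what a custom URL filter might look like, here is a minimal, self-contained sketch. It does not use the real Nutch classes: `UrlFilterSketch` is a hypothetical stand-in interface, and the contract it shows (return the URL to accept it, return null to reject it) is an assumption about how URLFilter behaves that should be checked against the API docs linked above. `HostSuffixFilter` is likewise an invented example rule, not something shipped with Nutch.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical stand-in for the Nutch URLFilter contract, declared here
// only so that the sketch compiles without the Nutch jars.
interface UrlFilterSketch {
    // Assumed contract: return the URL to accept it, or null to reject it.
    String filter(String urlString);
}

// Hypothetical rule: accept only URLs whose host is the given domain
// or one of its subdomains.
class HostSuffixFilter implements UrlFilterSketch {
    private final String domain;

    HostSuffixFilter(String domain) {
        this.domain = domain.toLowerCase();
    }

    @Override
    public String filter(String urlString) {
        try {
            String host = new URI(urlString).getHost();
            if (host == null) {
                return null; // no host component, so reject
            }
            host = host.toLowerCase();
            boolean accepted = host.equals(domain) || host.endsWith("." + domain);
            return accepted ? urlString : null;
        } catch (URISyntaxException e) {
            return null; // unparseable URLs are rejected
        }
    }
}
```

With this sketch, `new HostSuffixFilter("apache.org")` would accept a URL on lucene.apache.org and reject one on example.com. A real plugin would implement the Nutch URLFilter interface instead of the stand-in.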
  
+ == Setup ==
+ 
+ Start by [http://www.apache.org/dev/version-control.html#anon-svn 
downloading] the Nutch source code.  Once you have it, make sure it compiles 
as is before you make any changes.
+ 
  
  <<< PluginCentral
  
