Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "AboutPlugins" page has been changed by SebastianNagel:
https://wiki.apache.org/nutch/AboutPlugins?action=diff&rev1=11&rev2=12

Comment:
fix URL path to Apidocs

  Nutch's plugin system is based on the one used in 
[[http://www.eclipse.org/articles/Article-Plug-in-architecture/plugin_architecture.html|Eclipse
 2.x]].  Plugins are central to how Nutch works.  All of the parsing, indexing 
and searching that Nutch does is actually accomplished by various plugins.
  
- In writing a plugin, you're actually providing one or more ''extensions'' of 
the existing ''extension-points''. The core Nutch ''extension-points'' are 
themselves defined in a plugin, the 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/plugin/ExtensionPoint.html|NutchExtensionPoints]]
 plugin (they are listed in the !NutchExtensionPoints 
[[http://svn.apache.org/viewcvs.cgi/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml?view=markup|plugin.xml]]
 file). Each ''extension-point'' defines an interface that must be implemented 
by the ''extension''. The core extension points are:
+ In writing a plugin, you're actually providing one or more ''extensions'' of 
the existing ''extension-points''. The core Nutch ''extension-points'' are 
themselves defined in a plugin, the 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/plugin/ExtensionPoint.html|NutchExtensionPoints]]
 plugin (they are listed in the !NutchExtensionPoints 
[[http://svn.apache.org/viewcvs.cgi/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml?view=markup|plugin.xml]]
 file). Each ''extension-point'' defines an interface that must be implemented 
by the ''extension''. The core extension points are:
  
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/indexer/IndexWriter.html|IndexWriter]]
 -- Writes crawled data to a specific indexing backend (Solr, ElasticSearch, a 
CSV file, etc.).
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/indexer/IndexWriter.html|IndexWriter]]
 -- Writes crawled data to a specific indexing backend (Solr, ElasticSearch, a 
CSV file, etc.).
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/indexer/IndexingFilter.html|IndexingFilter]]
 -- Permits one to add metadata to the indexed fields. All plugins found which 
implement this extension point are run sequentially on the parse (from javadoc).
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/indexer/IndexingFilter.html|IndexingFilter]]
 -- Permits one to add metadata to the indexed fields. All plugins found which 
implement this extension point are run sequentially on the parse (from javadoc).
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/parse/Parser.html|Parser]]
 -- Parser implementations read through fetched documents in order to extract 
data to be indexed.  This is what you need to implement if you want Nutch to be 
able to parse a new type of content, or extract more data from currently 
parseable content.
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/parse/Parser.html|Parser]]
 -- Parser implementations read through fetched documents in order to extract 
data to be indexed.  This is what you need to implement if you want Nutch to be 
able to parse a new type of content, or extract more data from currently 
parseable content.
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/parse/HtmlParseFilter.html|HtmlParseFilter]]
 -- Permits one to add additional metadata to HTML parses (from javadoc).
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/parse/HtmlParseFilter.html|HtmlParseFilter]]
 -- Permits one to add additional metadata to HTML parses (from javadoc).
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/protocol/Protocol.html|Protocol]]
 -- Protocol implementations allow Nutch to use different protocols (ftp, http, 
etc.) to fetch documents.
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/protocol/Protocol.html|Protocol]]
 -- Protocol implementations allow Nutch to use different protocols (ftp, http, 
etc.) to fetch documents.
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/net/URLFilter.html|URLFilter]]
 -- URLFilter implementations limit the URLs that Nutch attempts to fetch.  The 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/net/RegexURLFilter.html|RegexURLFilter]]
 distributed with Nutch provides a great deal of control over what URLs Nutch 
crawls; however, if you have very complicated rules about what URLs you want to 
crawl, you can write your own implementation.
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/net/URLFilter.html|URLFilter]]
 -- URLFilter implementations limit the URLs that Nutch attempts to fetch.  The 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/net/RegexURLFilter.html|RegexURLFilter]]
 distributed with Nutch provides a great deal of control over what URLs Nutch 
crawls; however, if you have very complicated rules about what URLs you want to 
crawl, you can write your own implementation.
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/net/URLNormalizer.html|URLNormalizer]]
 -- Interface used to convert URLs to normal form and optionally perform 
substitutions.
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/net/URLNormalizer.html|URLNormalizer]]
 -- Interface used to convert URLs to normal form and optionally perform 
substitutions.
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/scoring/ScoringFilter.html|ScoringFilter]]
 -- A contract defining behavior of scoring plugins. A scoring filter will 
manipulate scoring variables in CrawlDatum and in resulting search indexes. 
Filters can be chained in a specific order, to provide multi-stage scoring 
adjustments. 
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/scoring/ScoringFilter.html|ScoringFilter]]
 -- A contract defining behavior of scoring plugins. A scoring filter will 
manipulate scoring variables in CrawlDatum and in resulting search indexes. 
Filters can be chained in a specific order, to provide multi-stage scoring 
adjustments. 
-  * 
[[http://nutch.apache.org/apidocs-1.8/org/apache/nutch/segment/SegmentMergeFilter.html|SegmentMergeFilter]]
 -- Interface used to filter segments during segment merge. It allows filtering 
on more sophisticated criteria than just URLs. In particular it allows 
filtering based on metadata collected while parsing the page. 
+  * 
[[http://nutch.apache.org/apidocs/apidocs-1.8/org/apache/nutch/segment/SegmentMergeFilter.html|SegmentMergeFilter]]
 -- Interface used to filter segments during segment merge. It allows filtering 
on more sophisticated criteria than just URLs. In particular it allows 
filtering based on metadata collected while parsing the page. 
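
As a rough illustration of what "writing your own implementation" of an extension point can look like, here is a minimal sketch modelled on the URLFilter contract described above (return the URL to keep it, return null to drop it). The names `SimpleUrlFilter`, `SuffixUrlFilter` and `SuffixUrlFilterDemo` are hypothetical and exist only for this example. `SimpleUrlFilter` is a stand-in for the real Nutch interface so that the sketch compiles without Nutch on the classpath. A real plugin would implement the Nutch interface and also needs a plugin.xml descriptor, which is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the URLFilter extension point, reduced to a
// single method. It is NOT the real Nutch interface.
interface SimpleUrlFilter {
    // Returns the URL if it is accepted, or null if it should be dropped.
    String filter(String urlString);
}

// Illustrative extension: rejects URLs ending in one of the given suffixes.
// Suffixes are assumed to be passed in lower case.
class SuffixUrlFilter implements SimpleUrlFilter {
    private final List<String> blockedSuffixes;

    SuffixUrlFilter(String... blockedSuffixes) {
        this.blockedSuffixes = Arrays.asList(blockedSuffixes);
    }

    @Override
    public String filter(String urlString) {
        if (urlString == null) {
            return null;
        }
        // Compare case-insensitively so that ".JPG" and ".jpg" match alike.
        String lower = urlString.toLowerCase(Locale.ROOT);
        for (String suffix : blockedSuffixes) {
            if (lower.endsWith(suffix)) {
                return null; // rejected
            }
        }
        return urlString; // accepted unchanged
    }
}

class SuffixUrlFilterDemo {
    public static void main(String[] args) {
        SimpleUrlFilter filter = new SuffixUrlFilter(".jpg", ".zip");
        // Prints the URL itself, because no blocked suffix matches.
        System.out.println(filter.filter("http://example.com/index.html"));
        // Prints "null", because the URL ends in a blocked suffix.
        System.out.println(filter.filter("http://example.com/photo.JPG"));
    }
}
```

Returning null rather than throwing keeps rejection cheap, which matters when a filter is applied to every candidate URL. Several such filters could be applied one after another, each receiving the URL that the previous one returned.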
  
  
- Updated to [[http://nutch.apache.org/apidocs-1.8/index.html | Nutch apidocs 
version 1.8]]
+ Updated to [[http://nutch.apache.org/apidocs/apidocs-1.8/index.html | Nutch 
apidocs version 1.8]]
  
  == Source Files ==
  
