Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JeromeCharron:
http://wiki.apache.org/nutch/WritingPlugins

The comment on the change is:
Complete the list of nutch-core extension points.

------------------------------------------------------------------------------
  == Introduction ==
  
- Writing a plugin allows you to extend and change Nutch without having to 
modify the core system.  In writing a plugin, you're actually writing an 
implementation of one of the following Nutch interfaces (Please update this 
list with any I've missed):
+ Writing a plugin allows you to extend and change Nutch without having to 
modify the core system.  In writing a plugin, you're actually providing one or 
more ''extensions'' of the existing ''extension-points''. The core Nutch 
''extension-points'' are themselves defined in a plugin, the 
NutchExtensionPoints plugin (they are listed in the NutchExtensionPoints 
[http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml?view=markup
 plugin.xml] file). Each ''extension-point'' defines an interface that must be 
implemented by the ''extension''. The Nutch core extension points are (Please 
update this list with any I've missed):
  
+  * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/clustering/OnlineClusterer.html
 OnlineClusterer] -- An extension point interface for online search results 
clustering algorithms (from javadoc).
+  * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/indexer/IndexingFilter.html
 IndexingFilter] -- Permits one to add metadata to the indexed fields. All 
plugins found which implement this extension point are run sequentially on the 
parse (from javadoc).
+  * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/ontology/Ontology.html 
Ontology]
   * [http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/Parser.html 
Parser] -- Parser implementations read through fetched documents in order to 
extract data to be indexed.  This is what you need to implement if you want 
Nutch to be able to parse a new type of content, or extract more data from 
currently parseable content.
+  * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/HtmlParseFilter.html
 HtmlParseFilter] -- Permits one to add additional metadata to HTML parses 
(from javadoc).
   * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/protocol/Protocol.html 
Protocol] -- Protocol implementations allow Nutch to use different protocols 
(ftp, http, etc.) to fetch documents.
+  * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/QueryFilter.html
 QueryFilter] -- Extension point for query translation. Permits one to add 
metadata to a query (from javadoc).
   * 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/URLFilter.html 
URLFilter] -- URLFilter implementations limit the URLs that Nutch attempts to 
fetch.  The 
[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/RegexURLFilter.html
 RegexURLFilter] distributed with Nutch provides a great deal of control over 
what URLs Nutch crawls. However, if you have very complicated rules about what 
URLs you want to crawl, you can write your own implementation.
+  * 
[http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/java/org/apache/nutch/analysis/NutchAnalyzer.java?view=markup
 NutchAnalyzer] -- An extension point that makes it possible to provide 
language-specific analyzers (see MultiLingualSupport proposal). ''Since it is 
still in development, it is not in the released javadoc''.
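
To make the idea of an ''extension'' implementing an ''extension-point'' interface more concrete, here is a minimal sketch of a custom URL filter. It is only an illustration under stated assumptions: the `URLFilter` interface below is a local stand-in modelled on `org.apache.nutch.net.URLFilter` (assumed to expose a single `filter` method that returns the URL when accepted and `null` when rejected), and `SingleHostURLFilter` is a hypothetical class name, not something shipped with Nutch. A real plugin would implement the Nutch interface itself and would also need to be declared in its own plugin.xml.

```java
import java.util.regex.Pattern;

// Local stand-in (assumption) for org.apache.nutch.net.URLFilter, declared
// here so the sketch compiles without Nutch on the classpath.
interface URLFilter {
    // Returns the URL if it is accepted, or null if it should be dropped.
    String filter(String urlString);
}

// Hypothetical extension: accept only http/https URLs on a single host.
class SingleHostURLFilter implements URLFilter {
    private final Pattern allowed;

    SingleHostURLFilter(String host) {
        // Pattern.quote keeps dots in the host name from acting as wildcards.
        this.allowed =
            Pattern.compile("^https?://" + Pattern.quote(host) + "(/.*)?$");
    }

    @Override
    public String filter(String urlString) {
        if (urlString == null) {
            return null;
        }
        // Keep the URL unchanged when it matches, reject it otherwise.
        return allowed.matcher(urlString).matches() ? urlString : null;
    }
}
```

With this sketch, `new SingleHostURLFilter("example.org")` would keep `http://example.org/a` and drop URLs on any other host, which is the kind of rule that may be awkward to express with the stock regular-expression filter.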
  
  == Setup ==
  
