Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "WhyNutchHasAPluginSystem" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/WhyNutchHasAPluginSystem?action=diff&rev1=11&rev2=12

- -- Originally written by !StefanGroschupf - 05 Oct 2004
+ Originally written by !StefanGroschupf - 05 Oct 2004, however the ethos and legacy behind the plugin system are still relevant.

  ''This text explain the ideas behind the nutch plugin system.''

@@ -8, +8 @@

  == Extensibility ==
- Plugins allow anyone to extend the functionality of Nutch simply by writing their own implementation of a given interface. For instance, the [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/msword/MSWordParser.html|MSWordParser]], used for parseing Word documents, is an implementation of the [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/Parser.html|Parser]] interface.
+ Plugins allow anyone to extend the functionality of Nutch simply by writing their own implementation of a given interface. For instance, the HTMLParser, used for parsing HTML documents, is an implementation of the Parser interface.

  == Flexibility ==
- Since everybody can write a plugin, hopefully in future there will be a large set of plugins to choose from. At that point Nutch administrators will each be able to assemble their own search engine based on her/his particular needs needs by installing the plugins he or she is interested in. He or she will be able to choose from different summarizing algorithms, add pdf file format or remove ftp protocol support.
+ Since everybody can write a plugin, hopefully in future there will be a large set of plugins to choose from. Advances in the adaptability of the TIKA plugin to deal with many common file types has permitted the removal of a lot of clutter from the plugins which are distributed with Nutch. This allows Nutch administrators to assemble their own search engine based on her/his particular needs needs by installing the plugins he or she is interested in with little or no hastle. Administrators are also able to choose from different summarizing algorithms, add pdf file format or remove ftp protocol support.

  == Maintainability ==

