Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by johnroman:
http://wiki.apache.org/nutch/PluginCentral

------------------------------------------------------------------------------
   * WritingPluginExample - A step-by-step example of how to write a plugin for the 0.7 branch. - updated by LucasBoullosa
   * [http://wiki.media-style.com/display/nutchDocu/Write+a+plugin Writing Plugins] - by Stefan

- == Plugins that Come with Nutch (0.7) ==
+ == Plugins that Come with Nutch (0.9) ==

  In order to get Nutch to use any of these plugins, you just need to edit your conf/nutch-site.xml file and add the name of the plugin to the list of plugin.includes.

@@ -24, +24 @@

   * '''parse-html''' - Parses HTML documents
   * '''parse-js''' - Parses Java``Script
   * '''parse-mp3''' - Parses MP3s
+  * '''parse-zip''' - Parses ZIP archives
+  * '''parse-mspowerpoint''' - Parses Microsoft Powerpoint files
   * '''parse-msword''' - Parses MS Word documents
+  * '''parse-msexcel''' - Parses MS Excel documents
   * '''parse-pdf''' - Parses PDFs
   * '''parse-rss''' - Parses RSS feeds
+  * '''parse-oo''' - Parses OpenOffice files
+  * '''parse-swf''' - Parses Shockwave Flash
   * '''parse-rtf''' - Parses RTF files
   * '''parse-text''' - Parses text documents
   * '''protocol-file''' - Retreives documents from the filesystem
@@ -47, +52 @@

   * '''lib-commons-httpclient'''
   * '''lib-http'''
   * '''lib-jakarta-poi'''
-  * '''lib-log4j'''
+  * '''lib-log4j'''
-  * '''lib-lucene-analyzers'''
+  * '''lib-lucene-analyzers''' - Lucene analyzers
-  * '''lib-nekohtml'''
-  * '''lib-parsems'''
+  * '''lib-nekohtml''' - automatic tag balancer
+  * '''lib-parsems''' - parse ms documents framework
   * '''parse-msexcel''' - Parses MS Excel documents
   * '''parse-mspowerpoint''' - Parses MS Powerpoint documents
   * '''parse-oo''' - Parses Open Office and Star Office documents (Extentsions: ODT, OTT, ODH, ODM, ODS, OTS, ODP, OTP, SXW, STW, SXC, STC, SXI, STI)
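
The page text quoted in the diff says that enabling a plugin means adding its name to plugin.includes in conf/nutch-site.xml. A minimal sketch of that step follows. It assumes the property value is a regular expression of plugin ids joined with "|", as in Nutch's stock configuration. The helper names (plugin_includes_value, property_xml, is_included) and the WANTED selection are hypothetical, chosen only for illustration. The script prints a property element to paste into the file and does not modify nutch-site.xml itself.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical selection; every id below appears in the plugin list above.
WANTED = ["protocol-file", "parse-text", "parse-html", "parse-pdf", "parse-zip"]


def plugin_includes_value(plugin_ids):
    """Join plugin ids into one alternation such as 'parse-html|parse-zip'.

    Assumption: plugin.includes is matched as a regular expression against
    plugin ids, so a plain '|'-separated list selects exactly those ids.
    """
    for pid in plugin_ids:
        # Reject anything that would change the meaning of the regex.
        if not re.fullmatch(r"[A-Za-z0-9-]+", pid):
            raise ValueError(f"unexpected characters in plugin id: {pid!r}")
    return "|".join(plugin_ids)


def property_xml(name, value):
    """Render a Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


def is_included(value, plugin_id):
    """Check whether a plugin id is selected by the given value."""
    return re.fullmatch(value, plugin_id) is not None


if __name__ == "__main__":
    print(property_xml("plugin.includes", plugin_includes_value(WANTED)))
```

The printed element belongs inside the top-level configuration element of conf/nutch-site.xml. A real setup would normally extend the default value rather than replace it, since the default also lists indexing and query plugins that this sketch leaves out.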