Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/PluginCentral

------------------------------------------------------------------------------
   * languageidentifier - Adds a lang field to the index.
   * ontology
   * parse-ext
-  * parse-html
-  * parse-js
-  * parse-mp3
-  * parse-msword
-  * parse-pdf
-  * parse-rss
-  * parse-rtf
-  * parse-text
-  * protocol-file
-  * protocol-ftp
-  * protocol-http
-  * protocol-httpclient
-  * query-basic
-  * query-more
-  * query-site
-  * query-url
+  * parse-html - Parses HTML documents
+  * parse-js - Parses JavaScript documents
+  * parse-mp3 - Parses MP3s
+  * parse-msword - Parses MS Word documents
+  * parse-pdf - Parses PDFs
+  * parse-rss - Parses RSS feeds
+  * parse-rtf - Parses RTF files
+  * parse-text - Parses text documents
+  * protocol-file - Retrieves documents from the filesystem
+  * protocol-ftp - Retrieves documents through ftp
+  * protocol-http - Retrieves documents through http
+  * protocol-httpclient - Also retrieves documents through http (How does it differ from protocol-http?)
+  * query-basic - Runs queries against content, url and anchor fields
+  * query-more - Runs queries against date, content-length, contentType, primaryType and subType fields.
+  * query-site - Runs queries against site field
+  * query-url - Runs queries against url field.
   * urlfilter-prefix
   * urlfilter-regex
