Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/PluginCentral

------------------------------------------------------------------------------
   * languageidentifier - Adds a lang field to the index.
   * ontology
   * parse-ext
-  * parse-html
-  * parse-js
-  * parse-mp3
-  * parse-msword
-  * parse-pdf
-  * parse-rss
-  * parse-rtf
-  * parse-text
-  * protocol-file
-  * protocol-ftp
-  * protocol-http
-  * protocol-httpclient
-  * query-basic
-  * query-more
-  * query-site
-  * query-url
+  * parse-html - Parses HTML documents
+  * parse-js - Parses JavaScript documents
+  * parse-mp3 - Parses MP3s
+  * parse-msword - Parses MS Word documents
+  * parse-pdf - Parses PDFs
+  * parse-rss - Parses RSS feeds
+  * parse-rtf - Parses RTF files
+  * parse-text - Parses text documents
+  * protocol-file - Retrieves documents from the filesystem
+  * protocol-ftp - Retrieves documents through FTP
+  * protocol-http - Retrieves documents through HTTP
+  * protocol-httpclient - Also retrieves documents through HTTP (How does it 
differ from protocol-http?)
+  * query-basic - Runs queries against content, url and anchor fields
+  * query-more - Runs queries against date, content-length, contentType, 
primaryType and subType fields.
+  * query-site - Runs queries against site field
+  * query-url - Runs queries against url field.
   * urlfilter-prefix
   * urlfilter-regex
  
