Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JeromeCharron:
http://wiki.apache.org/nutch/JeromeCharron

------------------------------------------------------------------------------
     * Some benchmarks: LanguageIdentifierBenchs
     * Enhance the LanguageParseFilter by checking the validity of the parsed 
language string.
     * '''TODO''': Enhance the LanguageParseFilter by correlating all the 
available clues (DublinCore, Meta-Http-Equiv, Content-Language, and 
statistical content analysis) instead of taking only the first one available.
+    * '''TODO''': Improve the API:
-    * '''TODO''': Improve API by returning an ordered list of candidate 
languages instead of just one.
+      * return an ordered list of candidate languages instead of just one.
+      * See also Andrzej's 
[http://www.nabble.com/Re%3A-lang-identifier-and-nutch-analyzer-in-trunk-p2533535.html comments]:
+        * exporting a list of supported languages,
+        * exporting an NGramProfile of the analyzed text,
+        * allowing the input to be processed in chunks.
   * MultiLingualSupport proposal.
     * Framework for a multi-lingual analysis:
       * Analysis ExtensionPoint
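
The API improvements listed in the TODO above (an ordered list of candidate 
languages, a list of supported languages, an exported n-gram profile of the 
analyzed text, and chunked input) could look roughly like the sketch below. 
It is written in Python for brevity and is not the Nutch LanguageIdentifier 
API: the class name, method names, toy training texts, and the cosine scoring 
are all assumptions made for illustration.

```python
from collections import Counter
import math


def ngrams(text, n=3):
    """Split text into character n-grams, using '_' as the word boundary."""
    padded = "_" + "_".join(text.lower().split()) + "_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LanguageIdentifierSketch:
    """Hypothetical identifier illustrating the proposed API shape."""

    def __init__(self, training_texts, n=3):
        self.n = n
        # One reference profile per language, built from toy training text.
        self.profiles = {
            lang: Counter(ngrams(text, n))
            for lang, text in training_texts.items()
        }
        self.analyzed = Counter()

    def supported_languages(self):
        """Export the list of supported languages."""
        return sorted(self.profiles)

    def feed(self, chunk):
        """Process one chunk of input.

        N-grams that span a chunk boundary are not counted in this sketch.
        """
        self.analyzed.update(ngrams(chunk, self.n))

    def profile(self):
        """Export a copy of the n-gram profile of the text analyzed so far."""
        return Counter(self.analyzed)

    def candidates(self):
        """Return (language, score) pairs ordered from best to worst match."""
        scores = {
            lang: cosine(self.analyzed, ref)
            for lang, ref in self.profiles.items()
        }
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


identifier = LanguageIdentifierSketch({
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "fr": "le renard brun rapide saute par dessus le chien paresseux et le chat",
})
identifier.feed("the dog")
identifier.feed("and the fox")
ranking = identifier.candidates()
```

Returning the whole ranking rather than only its first entry lets a caller 
such as the LanguageParseFilter weigh the statistical result against the 
other clues before settling on a language.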
