Hi,

On Fri, Aug 7, 2009 at 12:31 PM, Andrzej Bialecki <a...@getopt.org> wrote:
> .. and a Nutch plugin with similar functionality:
>
> http://lucene.apache.org/nutch/apidocs-1.0/org/apache/nutch/analysis/lang/LanguageIdentifier.html

See also TIKA-209 [1] where I'm currently integrating the Nutch code
to work with Tika. Tika 0.5 will have built-in language detection
based on this.

[1] https://issues.apache.org/jira/browse/TIKA-209

BR,

Jukka Zitting