Interesting... I am looking for some "data mining" concepts, and I found http://opennlp.org, "Natural Language Processing"...
Information classification, finding new language terms/tokens such as "IBM T42p", "SuSE Linux 10.0", "Red Rouge", "Break Barrel", etc...

-----Original Message-----
From: Otis

No, LingPipe is a different beast, not to be compared with Nutch or Lucene. It doesn't "index" anything in the Lucene sense, although it does create certain in-memory or on-disk language models. The authors are very smart guys!
Oh, and LingPipe was described in Lucene in Action's Case Study chapter, along with Nutch and others.

Otis

----- Original Message ----
From: Fuad

Another interesting tool to perform linguistic analysis on natural language data: http://www.alias-i.com/lingpipe/ - is it really an "indexing" engine? They are using the NekoHTML parser.
