> > I am working on building custom analyzer
To build a custom analyzer, take a look at the analysis-de and analysis-fr
plugins (they use some Lucene analyzers).
A specific analyzer is used depending on the language guessed by the
language identifier.
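As a rough illustration of that dispatch (plain Java, not Nutch code: the class and method names are hypothetical, and the analyzers are represented by plain strings), the selection could be sketched like this:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are Nutch or Lucene classes.
class AnalyzerSelectionSketch {
    // Maps a language code (as guessed by a language identifier) to an analyzer name.
    private final Map<String, String> analyzerByLanguage = new HashMap<>();
    private final String defaultAnalyzer;

    AnalyzerSelectionSketch(String defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Associate an analyzer with a language code such as "de" or "fr".
    void register(String languageCode, String analyzerName) {
        analyzerByLanguage.put(languageCode, analyzerName);
    }

    // Fall back to the default analyzer when the language is unknown or not registered.
    String select(String detectedLanguage) {
        if (detectedLanguage == null) {
            return defaultAnalyzer;
        }
        return analyzerByLanguage.getOrDefault(detectedLanguage, defaultAnalyzer);
    }

    public static void main(String[] args) {
        AnalyzerSelectionSketch selector = new AnalyzerSelectionSketch("default");
        selector.register("de", "analysis-de");
        selector.register("fr", "analysis-fr");
        System.out.println(selector.select("fr"));
        System.out.println(selector.select("mr"));
    }
}
```

In this sketch, a Marathi document ("mr" is assumed as its language code) falls back to the default until an analyzer is registered for it.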
> > and language detector
> > for native language ("Marathi"), does anybody have idea how to extend
> > nutch for using this language.
Use the org.apache.nutch.analysis.lang.NGramProfile command to generate an
n-gram profile for Marathi from a text corpus.
Usage for creating a new profile is:
    NGramProfile -create profilename filename encoding
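To give an idea of what an n-gram profile contains, here is a minimal sketch in plain Java. It is not the NGramProfile implementation: the class name, the "_" word padding, and the ranking rule are assumptions made for illustration only. It counts character n-grams in a text and lists the most frequent ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: this is NOT Nutch's NGramProfile, just the general idea.
class NGramProfileSketch {

    // Count every character n-gram of length minLen..maxLen in each word.
    // Words are padded with '_' (an assumption) so word boundaries are counted too.
    static Map<String, Integer> countNGrams(String text, int minLen, int maxLen) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String padded = "_" + word + "_";
            for (int n = minLen; n <= maxLen; n++) {
                for (int i = 0; i + n <= padded.length(); i++) {
                    counts.merge(padded.substring(i, i + n), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Return up to `limit` n-grams, most frequent first; ties are broken alphabetically.
    static List<String> topNGrams(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // A tiny sample; a real profile needs a large Marathi corpus.
        Map<String, Integer> counts = countNGrams("मराठी भाषा मराठी", 1, 3);
        System.out.println(topNGrams(counts, 10));
    }
}
```

The sketch works on UTF-16 code units, so Devanagari combining signs are counted as separate characters. A language identifier typically compares such a ranked profile against the profile of the document being classified.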
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/