> >  I am working on building custom analyzer

To build a custom analyzer, take a look at the analysis-de and analysis-fr
plugins (they wrap some Lucene analyzers).
The analyzer used for a document is chosen according to the language guessed
by the language identifier.


> and language detector
> > for native language("Marathi") , does anybody have idea how to extend
> > nutch for using this language.

Run the org.apache.nutch.analysis.lang.NGramProfile class as a command to
generate an n-gram profile for Marathi from a text corpus.
The usage for creating a new profile is:

    NGramProfile -create profilename filename encoding
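
For context, an n-gram profile is essentially a ranked list of the most
frequent character sequences found in the corpus. Below is a rough sketch of
that idea in plain Java. It is not the Nutch implementation: the class name
NGramProfileSketch, the "_" word-boundary padding, and the 1-to-3 length range
are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an n-gram profile; not part of Nutch.
final class NGramProfileSketch {

    // Count character n-grams of length minLen..maxLen inside each word.
    // Words are runs of letters and combining marks, so Devanagari vowel
    // signs stay attached to the word they belong to.
    static Map<String, Integer> countNGrams(String text, int minLen, int maxLen) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^\\p{L}\\p{M}]+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Assumption: mark word boundaries with '_' on both sides.
            String padded = "_" + word + "_";
            for (int n = minLen; n <= maxLen; n++) {
                for (int i = 0; i + n <= padded.length(); i++) {
                    counts.merge(padded.substring(i, i + n), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Return up to 'limit' n-grams, most frequent first; ties are broken
    // alphabetically so the ordering is deterministic.
    static List<String> topNGrams(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Tiny made-up sample; a real profile needs a large Marathi corpus.
        Map<String, Integer> counts = countNGrams("मराठी भाषा मराठी", 1, 3);
        System.out.println(topNGrams(counts, 10));
    }
}
```

The larger and more representative the corpus, the more reliable the ranking
becomes, which is why the profile should be built from a sizeable body of
Marathi text in a known encoding.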

Regards

Jérôme


--
http://motrech.free.fr/
http://www.frutch.org/
