Could you tell me where I can find documentation on how to use NGramProfile to train the language identifier, and how to detect a language with it?
Marathi is a language used in India. It uses the Devanagari script, and words are separated by spaces. Would it be OK to use StopAnalyzer with my custom stopwords instead of NutchDocumentAnalyzer? Where would I have to make changes in the Nutch code?
