On 9/4/07, martin <[EMAIL PROTECTED]> wrote:
> Hi all:
> I'm new to nutch. Can I use my own analyzer, which is based on
> lucene, to build the index and search instead of the nutch default analyzer?
> Do I write a new plugin, or does nutch support this through something like
> a config file? I haven't found any useful document so far :(

Nutch supports different analyzers, but the support is a bit limited. By default, documents are analyzed with NutchDocumentAnalyzer. If an analysis plugin is enabled (such as analysis-fr) and a document is written in a language handled by that plugin, the document is analyzed by the plugin instead of NutchDocumentAnalyzer. For example, analysis-fr is for analyzing French documents, so if a document is in French (probably recognized by language-identifier), it is analyzed by analysis-fr instead of the default analyzer.

So, you can either define a new analyzer plugin or change NutchDocumentAnalyzer to process documents with a different analyzer.

(PS: As I said before, this scheme is a bit limited, and it is actually one of the things that we want to improve.)

-- 
Doğacan Güney
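
To make the selection rule described above concrete, here is a minimal sketch in plain Java. It does not use Nutch's real classes: AnalyzerSelectionSketch, SketchAnalyzer, enable() and select() are hypothetical names made up for illustration, and each analyzer is reduced to just its name. It only models the idea that a language-specific plugin wins when one is enabled for the document's language, and the default analyzer is used otherwise.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the selection rule; these are NOT Nutch's actual classes.
class AnalyzerSelectionSketch {

    /** Stand-in for an analyzer. Real Nutch analyzers build on Lucene's Analyzer. */
    interface SketchAnalyzer {
        String name();
    }

    /** Fallback used when no enabled plugin claims the document's language. */
    static final SketchAnalyzer DEFAULT = () -> "NutchDocumentAnalyzer";

    // Language code -> analyzer contributed by an enabled analysis plugin.
    private final Map<String, SketchAnalyzer> byLanguage = new HashMap<>();

    /** Registers a plugin (for example "analysis-fr") for one language code. */
    void enable(String lang, String pluginName) {
        byLanguage.put(lang, () -> pluginName);
    }

    /** Picks the plugin for the detected language, or the default analyzer. */
    SketchAnalyzer select(String lang) {
        if (lang == null) {
            return DEFAULT; // language could not be identified
        }
        return byLanguage.getOrDefault(lang, DEFAULT);
    }

    public static void main(String[] args) {
        AnalyzerSelectionSketch sketch = new AnalyzerSelectionSketch();
        sketch.enable("fr", "analysis-fr");

        System.out.println(sketch.select("fr").name());
        System.out.println(sketch.select("en").name());
        System.out.println(sketch.select(null).name());
    }
}
```

In this model, plugging in your own Lucene-based analyzer corresponds either to one more enable() call (a new analyzer plugin) or to swapping what DEFAULT does (changing NutchDocumentAnalyzer).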
