Is there an API doc or design doc that I can read to
understand where you are? Is the language plugin architecture
already in the main trunk?

The only available document is
http://wiki.apache.org/nutch/MultiLingualSupport
I also update this page from time to time:
http://wiki.apache.org/nutch/JeromeCharron


Here are some issues that I've been worried about:
* Support for multilingual plugins?
** If one plugin can support more than one language,
   the language needs to be passed at each analysis.

I'm not sure I understand the need.
But if you have an analysis plugin that can handle several languages, you
can simply declare several implementations in your plugin.xml, e.g.:

<extension id="org.apache.nutch.analysis.cjk"
             name="CJKAnalyzer"
             point="org.apache.nutch.analysis.NutchAnalyzer">

     <implementation id="org.apache.nutch.analysis.cn.ChineseAnalyzer"
                     class="org.apache.nutch.analysis.cjk.CJKAnalyzer">
       <parameter name="lang" value="cn"/>
     </implementation>

     <implementation id="org.apache.nutch.analysis.kr.KoreanAnalyzer"
                     class="org.apache.nutch.analysis.cjk.CJKAnalyzer">
       <parameter name="lang" value="kr"/>
     </implementation>

     <implementation id="org.apache.nutch.analysis.jp.JapaneseAnalyzer"
                     class="org.apache.nutch.analysis.cjk.CJKAnalyzer">
       <parameter name="lang" value="jp"/>
     </implementation>

</extension>
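
To make the idea concrete, here is a minimal Java sketch of how a "lang" parameter could select an analyzer at run time. It is not the Nutch plugin API: TextAnalyzer, SharedCjkAnalyzer, DefaultAnalyzer and AnalyzerRegistry are hypothetical names invented for this illustration, and the fallback behaviour for an unknown or missing language is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an analyzer; not the Nutch or Lucene interface.
interface TextAnalyzer {
    String name();
}

// One implementation class registered for several languages, mirroring
// the three <implementation> entries that share the same class.
class SharedCjkAnalyzer implements TextAnalyzer {
    public String name() { return "CJKAnalyzer"; }
}

// Assumed fallback used when no analyzer is registered for a language.
class DefaultAnalyzer implements TextAnalyzer {
    public String name() { return "DefaultAnalyzer"; }
}

// Hypothetical registry keyed by the value of the "lang" parameter.
class AnalyzerRegistry {
    private final Map<String, TextAnalyzer> byLang = new HashMap<>();
    private final TextAnalyzer fallback = new DefaultAnalyzer();

    void register(String lang, TextAnalyzer analyzer) {
        byLang.put(lang, analyzer);
    }

    // Returns the analyzer registered for lang, or the fallback when the
    // language is unknown or was not identified (null).
    TextAnalyzer get(String lang) {
        if (lang == null) {
            return fallback;
        }
        return byLang.getOrDefault(lang, fallback);
    }

    // Builds a registry equivalent to the three entries in the descriptor.
    static AnalyzerRegistry cjkExample() {
        AnalyzerRegistry registry = new AnalyzerRegistry();
        TextAnalyzer cjk = new SharedCjkAnalyzer();
        registry.register("cn", cjk);
        registry.register("kr", cjk);
        registry.register("jp", cjk);
        return registry;
    }

    public static void main(String[] args) {
        AnalyzerRegistry registry = cjkExample();
        System.out.println(registry.get("kr").name());
        System.out.println(registry.get("fr").name());
    }
}
```

The point of the sketch is that the language is resolved once, when the analyzer is looked up, so a multi-language plugin does not need the language passed to it again on every analysis call.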


** This assumes language identification is done before
   analysis.  Is that the case?

Yes.


* Support for a different analyzer for queries than for indexing
** The analyzer for queries may need to behave differently from
   the analyzer for indexing.  Can your architecture
   specify different analyzers for indexing and query?

In fact, to avoid adding a QueryAnalyser extension point,
queries use the same Analyzer implementation as the one
used for document analysis.
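
As a rough illustration of why sharing one analyzer works, here is a toy Java sketch in which indexing and querying both go through the same tokenizer, so query terms are normalized exactly like indexed terms. ToyAnalyzer and ToyIndex are hypothetical classes written for this example only; they are not Nutch or Lucene code, and the lowercase-and-split rule is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical toy analyzer: lowercases the text and splits it on
// anything that is not a letter or a digit.
class ToyAnalyzer {
    List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }
}

// Hypothetical in-memory index that uses one analyzer for both
// document analysis and query analysis.
class ToyIndex {
    private final ToyAnalyzer analyzer;
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    ToyIndex(ToyAnalyzer analyzer) {
        this.analyzer = analyzer;
    }

    // Indexing path: analyze the document text and record its terms.
    void add(int docId, String text) {
        for (String term : analyzer.tokens(text)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
        }
    }

    // Query path: analyze the query with the same analyzer, then return
    // the ids of documents that contain every query term.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : analyzer.tokens(query)) {
            Set<Integer> docs = postings.getOrDefault(term, new HashSet<>());
            if (result == null) {
                result = new HashSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new HashSet<>() : result;
    }
}
```

With this design a query such as "LANGUAGE plugins" matches a document indexed as "Nutch Language Plugins". The trade-off is the one raised in the question: any query-specific behaviour would have to live inside the shared analyzer.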

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
