It's pretty simple. You only need to enable the plugin in your
nutch-site.xml file by adding clustering-carrot2 to plugin.includes:
<property>
<name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|parse-(text|html|pdf-mod|rtf|msword|msexcel|mspowerpoint)|index-(basic|anchor|more)|query-(basic|site|url|more)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|clustering-carrot2</value>
</property>
and set the default clustering language:
<!-- Carrot2 clustering properties -->
<property>
<name>extension.clustering.carrot2.defaultLanguage</name>
<value>en</value>
<description>Two-letter ISO code of the language.
http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt
</description>
</property>
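Put together, the whole file might look roughly like the sketch below. I am
assuming the usual <configuration> root element of nutch-site.xml here, and
the plugin list is just the one from my setup above, so trim it to the
plugins you actually have installed (pdf-mod, for instance, may not exist in
your build).

```xml
<?xml version="1.0"?>
<!-- Sketch of nutch-site.xml; assumes the standard <configuration> root. -->
<configuration>

  <!-- Keep the value on a single line: it is one regular expression. -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-httpclient|urlfilter-regex|parse-(text|html|pdf-mod|rtf|msword|msexcel|mspowerpoint)|index-(basic|anchor|more)|query-(basic|site|url|more)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|clustering-carrot2</value>
  </property>

  <!-- Carrot2 clustering properties -->
  <property>
    <name>extension.clustering.carrot2.defaultLanguage</name>
    <value>en</value>
  </property>

</configuration>
```

Values in nutch-site.xml override the defaults from nutch-default.xml, so you
only need to list the properties you want to change.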
For more information refer to
http://wiki.apache.org/nutch/ClusteringPlugin
Alexander
2008/12/2 Miao Jiang <[EMAIL PROTECTED]>
> Hi all,
>
> I am now trying to use the plugin of Carrot2. However, I have no idea of how
> to modify the nutch-site.xml and I did not find any answer online. Is there
> anyone who has used this plugin?
>
> Thanks.
> Alvin
>
--
Best Regards
Alexander Aristov