I activated the carrot-clustering plugin on Nutch today. By default the plugin is located in this directory: \nutch-0.9\WEB-INF\classes\plugins\clustering-carrot2
Since it was downloaded with Nutch 0.9, I am not sure it is the latest version. The plugin itself is 1.0.2, and I do not know whether a newer plugin version exists now that Carrot2 2.1 is out.

To enable clustering, do the following.

Instruction - http://wiki.apache.org/nutch/ClusteringPlugin

This is what I did:

1. Go to your nutch-0.9\WEB-INF\classes directory (nutch-0.9 is the root directory of your installation).

2. Find the nutch-site.xml file. In the plugin.includes property, find the <value> tag and add the following at the end of the existing value:

   |clustering-carrot2

   See the example below:

   <property>
     <name>plugin.includes</name>
     <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|clustering-carrot2</value>
     <description>Regular expression naming plugin directory names to
     include. Any plugin not matching this expression is excluded.
     In any case you need at least include the nutch-extensionpoints plugin. By
     default Nutch includes crawling just HTML and plain text via HTTP,
     and basic indexing and search plugins. In order to use HTTPS please enable
     protocol-httpclient, but be aware of possible intermittent problems with the
     underlying commons-httpclient library.
     </description>
   </property>

3. Go to the Nutch web application (I assume that it is configured and working), type your search criteria, check the clustering option near the search button, and click search. You will see the clusters on the right-hand side.

Hope it helps.

Now I wonder if I can install the latest plugin for Nutch, since Carrot2 2.1 was released.

Thank you,
Armen

--- Emmanuel <[EMAIL PROTECTED]> wrote:

> Interesting....
>
> Could you give us more details to deploy it on nutch?
>
> Do we only need to add the plugin? Or have we some config files?
>
> > Hi All,
> >
> > A bit of self-promotion again :) I hope you don't find it out of topic,
> > after all, some folks are using Carrot2 with Lucene and Solr, and Nutch
> > has a Carrot2-based clustering plugin.
> >
> > Staszek
> > [EMAIL PROTECTED]
> >
> > ______________________________________________________________________
> >
> > Carrot2 Search Results Clustering Engine version 2.1 released
> >
> > Version 2.1 of the Java-based Open Source Search Results Clustering
> > Engine called Carrot2 has been released. Carrot2 can fetch search
> > results from a variety of sources and automatically organize (cluster)
> > them into thematic categories using one of its specialized search
> > results clustering algorithms.
> >
> > The 2.1 release comes with the Document Clustering Server that exposes
> > Carrot2 clustering as an XML-RPC or REST service with convenient XML or
> > JSON data formats enabling e.g. quick PHP, .NET or Ruby integration.
> > The new release also adds new search results sources and many other
> > improvements (http://project.carrot2.org/release-2.1-notes.html).
> >
> > At the same time Carrot Search, the Carrot2 spin-off company, released
> > version 1.2 of Lingo3G -- a high-performance document clustering engine
> > offering hierarchical clustering, synonyms, label filtering and
> > advanced tuning capabilities.
> >
> > For more information, please check
> >
> > Carrot2 live demo -- http://www.carrot2.org
> > Carrot2 project website -- http://project.carrot2.org
> > Release 2.1 notes -- http://project.carrot2.org/release-2.1-notes.html
> >
> > Carrot Search -- http://www.carrot-search.com
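
P.S. The nutch-site.xml change described at the top of this message can also be scripted. Below is a minimal Python sketch, not part of Nutch: the function name enable_plugin is hypothetical, and it assumes the file already contains a plugin.includes property with a <value> element. Note that ElementTree drops XML comments on re-serialization, so keep a backup of the original file.

```python
import xml.etree.ElementTree as ET

PLUGIN = "clustering-carrot2"


def enable_plugin(site_xml: str, plugin: str = PLUGIN) -> str:
    """Return site_xml with `plugin` appended to the plugin.includes value.

    Hypothetical helper for illustration only. The call is idempotent:
    if the plugin name already appears in the value, nothing changes.
    """
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() != "plugin.includes":
            continue
        value = prop.find("value")
        if value is None:
            raise ValueError("plugin.includes has no <value> element")
        current = (value.text or "").strip()
        if plugin not in current:
            # The value is a regular expression; '|' adds one more alternative.
            value.text = f"{current}|{plugin}" if current else plugin
        return ET.tostring(root, encoding="unicode")
    raise ValueError("no plugin.includes property found")
```

To use it, read nutch-site.xml into a string, pass it to enable_plugin, and write the result back, then restart the web application so that the new plugin list is picked up.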
