Re: Nutch crawled results for Clustering with Carrot2

Dawid Weiss Thu, 07 May 2009 02:12:20 -0700


Gaurang,

You can fetch documents from Nutch indexes (which are Lucene indexes) and then feed them to the clustering algorithm directly, as explained in Carrot2 examples here:


http://download.carrot2.org/head/manual/index.html#section.integration

There are several examples you can choose to start from -- some of them accept raw data, some of them use a Lucene document source.


http://fisheye3.atlassian.com/browse/carrot2/branches/stable/applications/carrot2-examples/src/org/carrot2/examples/clustering

If you need ultimate flexibility, go with the raw-data example:

http://fisheye3.atlassian.com/browse/carrot2/branches/stable/applications/carrot2-examples/src/org/carrot2/examples/clustering/ClusteringDocumentList.java?r=3345
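
As a rough illustration of the raw-data route, here is a minimal, self-contained sketch of the step that comes before clustering: turning the stored fields of each index hit into (title, snippet, URL) triples. Everything in it is an assumption made for illustration. The class names NutchRawDocuments and RawDocument, the field names "title", "content" and "url", and the snippet length are hypothetical and are not taken from Nutch or Carrot2. With the real libraries you would read the stored fields through Lucene and build Carrot2's own document objects instead, as the linked example does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: turn stored index fields into raw (title, snippet, URL) triples. */
final class NutchRawDocuments {

    /** Hypothetical stand-in for the document type a clustering engine accepts. */
    static final class RawDocument {
        final String title;
        final String snippet;
        final String url;

        RawDocument(String title, String snippet, String url) {
            this.title = title;
            this.snippet = snippet;
            this.url = url;
        }
    }

    /**
     * Converts one map of stored fields per hit into raw documents.
     * The field names "title", "content" and "url" are assumptions; check
     * which fields your index actually stores. Hits without a URL are
     * skipped, and missing text becomes an empty string.
     */
    static List<RawDocument> toRawDocuments(List<Map<String, String>> hits,
                                            int maxSnippetChars) {
        List<RawDocument> docs = new ArrayList<RawDocument>();
        for (Map<String, String> fields : hits) {
            String url = fields.get("url");
            if (url == null || url.isEmpty()) {
                continue;
            }
            String title = orEmpty(fields.get("title"));
            String content = orEmpty(fields.get("content"));
            // Keep only a short prefix of the page text as the snippet.
            String snippet = content.length() > maxSnippetChars
                    ? content.substring(0, maxSnippetChars)
                    : content;
            docs.add(new RawDocument(title, snippet, url));
        }
        return docs;
    }

    private static String orEmpty(String s) {
        return s == null ? "" : s.trim();
    }

    public static void main(String[] args) {
        // Made-up sample hit standing in for fields read from an index.
        Map<String, String> hit = new HashMap<String, String>();
        hit.put("url", "http://example.com/page");
        hit.put("title", "Example page");
        hit.put("content", "Some page text that would normally come from the crawl.");

        List<Map<String, String>> hits = new ArrayList<Map<String, String>>();
        hits.add(hit);

        for (RawDocument d : toRawDocuments(hits, 20)) {
            System.out.println(d.url + " | " + d.title + " | " + d.snippet);
        }
    }
}
```

The resulting list is the kind of input the raw-data example expects; the clustering call itself is left out here because it depends on the Carrot2 version you use.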

Dawid


Gaurang Patel wrote:

Hi all,

Does anyone know how I can use Nutch crawl results for clustering
with the Carrot2 clustering engine? What I want is different from the Carrot2
clustering plugin that comes with Nutch. I want to write my own code for
retrieving a document list from the Nutch crawl results, and then supply
this list to the Carrot2 algorithm.

Any kind of quick help will be appreciated.

Regards,
Gaurang
