Gaurang,

You can fetch documents from the Nutch indexes (which are Lucene indexes) and then feed them to the clustering algorithm directly, as explained in the Carrot2 integration examples here:

http://download.carrot2.org/head/manual/index.html#section.integration

There are several examples you can start from: some of them accept raw data, others use a Lucene document source.

http://fisheye3.atlassian.com/browse/carrot2/branches/stable/applications/carrot2-examples/src/org/carrot2/examples/clustering

If you need ultimate flexibility, go with the raw-data example:

http://fisheye3.atlassian.com/browse/carrot2/branches/stable/applications/carrot2-examples/src/org/carrot2/examples/clustering/ClusteringDocumentList.java?r=3345
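
To give a rough idea of the raw-data route, here is a minimal sketch of the step that sits between the index and the clustering call. It uses only the JDK so it runs on its own. The class name, the RawDoc holder and the field names "title", "content" and "url" are assumptions made for illustration; check which fields your Nutch index actually stores. In real code the maps would be filled from Lucene documents, and the resulting list would be converted to Carrot2 documents the way ClusteringDocumentList.java does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NutchToClusteringInput {

    /** Hypothetical holder for the fields a clustering algorithm typically needs. */
    static final class RawDoc {
        final String title;
        final String summary;
        final String url;

        RawDoc(String title, String summary, String url) {
            this.title = title;
            this.summary = summary;
            this.url = url;
        }
    }

    /**
     * Converts stored index fields (one map per hit) into raw documents.
     * The field names used here are assumptions, not a fixed Nutch contract.
     */
    static List<RawDoc> toRawDocs(List<Map<String, String>> storedFields) {
        List<RawDoc> docs = new ArrayList<>();
        for (Map<String, String> fields : storedFields) {
            String title = fields.getOrDefault("title", "").trim();
            String summary = fields.getOrDefault("content", "").trim();
            String url = fields.getOrDefault("url", "").trim();
            // A hit with no text at all gives the algorithm nothing to cluster on.
            if (title.isEmpty() && summary.isEmpty()) {
                continue;
            }
            docs.add(new RawDoc(title, summary, url));
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits = List.of(
            Map.of("title", "Apache Nutch", "content", "Open source web crawler", "url", "http://example.org/a"),
            Map.of("url", "http://example.org/empty"));
        for (RawDoc doc : toRawDocs(hits)) {
            System.out.println(doc.title + " | " + doc.url);
        }
    }
}
```

Keeping this conversion separate from the clustering call means you can filter or trim the documents however you like before handing them over.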

Dawid


Gaurang Patel wrote:
Hi all,

Does anyone know how I can use the Nutch crawl results for clustering
with the Carrot2 clustering engine? What I want is different from the Carrot2
clustering plugin that comes with Nutch. I want to write my own code to
retrieve a document list from the Nutch crawl results and then supply
this list to the Carrot2 algorithm.

Any kind of quick help will be appreciated.

Regards,
Gaurang
