Re: Clustering Demo

Andrzej Bialecki Thu, 08 May 2008 08:28:36 -0700

Grant Ingersoll wrote:

Anyone have any sample code or demo of running the clustering over a large collection of documents that they could share? Mainly looking for an example of taking some corpus, converting it into the appropriate Mahout representation and then running either the k-means or the canopy clustering on it.

It would be way cool to do this with the industry standard 20 newsgroups corpus - there have been many experiments and evaluations of this corpus, so it's good as a baseline.



--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
