Another option would be to add a new command line option to the
ClusterDumper to produce the abbreviated output you desire. Then you
could submit it as a patch and everybody could benefit. Offhand, this
seems like a useful output representation.
Jeff
On 5/24/13 6:57 AM, Rajesh Nikam wrote:
Hi Willem,
I'm not aware of any tool that does this out of the box.
An easier way would be to write a script (e.g. in Perl) that parses the cluster
dump and separates out the instances for each cluster.
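As a rough sketch of such a script (in Python rather than Perl), the snippet
below keeps only the cluster summary lines and drops everything else. It
assumes the text output of clusterdump puts each summary on one line in the
form shown in the original question, with dense c=[...] and r=[...] vectors.
The function name and the regular expression are illustrative, not part of
Mahout, and would need adjusting if your dump looks different.

```python
import re
import sys

# Assumed summary-line shape, taken from the example in the question:
#   CL-12124070{ n=4559664 c=[8.470, 68.606] r=[6.303, 49.436]}
# The two-letter prefix is matched loosely so other prefixes also pass.
SUMMARY_RE = re.compile(
    r"^\s*(?P<id>[A-Z]{2}-\d+)\{\s*"
    r"n=(?P<n>\d+)\s+"
    r"c=\[(?P<c>[^\]]*)\]\s+"
    r"r=\[(?P<r>[^\]]*)\]\s*\}"
)


def _to_floats(text):
    """Turn '8.470, 68.606' into [8.47, 68.606]; an empty string gives []."""
    return [float(tok) for tok in text.split(",") if tok.strip()]


def cluster_summaries(lines):
    """Yield one dict per cluster summary line; skip weight and point lines."""
    for line in lines:
        match = SUMMARY_RE.match(line)
        if match is None:
            continue  # not a summary line (e.g. a weight or point entry)
        yield {
            "id": match.group("id"),
            "n": int(match.group("n")),
            "c": _to_floats(match.group("c")),
            "r": _to_floats(match.group("r")),
        }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python summaries.py clusterdump.txt
    with open(sys.argv[1], encoding="utf-8") as dump:
        for cluster in cluster_summaries(dump):
            print(cluster["id"], cluster["n"], cluster["c"], cluster["r"])
```

Because the input is read line by line, this should cope with a large dump
without loading it all into memory.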
Thanks,
Rajesh
On Fri, May 24, 2013 at 4:02 PM, Willem Conradie [ MTN – Innovation Centre
] <[email protected]> wrote:
Hi, can somebody please assist with my question below?
Regards,
Willem
From: Willem Conradie [ MTN – Innovation Centre ]
Sent: Wednesday, 15 May 2013 07:57 AM
To: [email protected]
Subject: Mahout Cluster attributes
Hi,
After running a clustering process, is there a way to retrieve just the
cluster attributes (cluster ID, number of records, centroid and radius, e.g.
CL-12124070{ n=4559664 c=[8.470, 68.606] r=[6.303, 49.436]}) without doing
a full cluster dump? Using ‘clusterdump’ provides this, but it also adds
the weights and points, which in my case are unwanted.
Regards,
Willem