I'm using Mahout's clustering capabilities. Not sure if that answers your question about what part I'm using.

I'm interested in exporting the learned clusters and the cluster assignments.
Is there a CSV exporter for cluster output?

I tried using the clusterdump utility to write a text file that I could then parse, but clusterdump appears to load all the vectors into memory, which causes an out-of-memory error because my data set is too large.
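
For illustration, a streaming export along the following lines would avoid holding every vector in memory at once. This is only a minimal sketch in Python: the function name `write_assignments_csv` and the generator `fake_records` are hypothetical, and the code that would actually read Mahout's cluster output is not shown. It assumes some reader can yield one (cluster id, vector) pair at a time.

```python
import csv
import io


def write_assignments_csv(records, out):
    """Stream (cluster_id, vector) pairs to CSV, one row per point.

    `records` can be any iterable. It is consumed lazily, so only one
    record needs to be in memory at a time.
    """
    writer = csv.writer(out)
    count = 0
    for cluster_id, vector in records:
        # First column is the cluster id, the rest are vector components.
        writer.writerow([cluster_id, *vector])
        count += 1
    return count


def fake_records():
    # Hypothetical stand-in for a reader over the real clustering output.
    yield 0, [1.0, 2.0]
    yield 1, [3.5, 4.5]
    yield 0, [1.1, 2.1]


buf = io.StringIO()
n = write_assignments_csv(fake_records(), buf)
```

In practice `out` would be an open file rather than a `StringIO` buffer, and the resulting CSV could then be read incrementally from Python or R.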

J

Quoting Ted Dunning <[email protected]>:

Which part of Mahout are you exporting from?

Some areas have serialization formats that you could use.  Some others have
JSON export capabilities.

For R, I generally use CSV export, but I usually am involved with
classifiers.

On Thu, Feb 24, 2011 at 5:51 PM, <[email protected]> wrote:

Hi,

I'm wondering how people export the results of Mahout processing (e.g., the
results of k-means) for analysis in a high-level language like MATLAB,
Python, R, etc.

I work in Python (CPython, not Jython) and I'm curious how other people
approach this problem.

I'm getting ready to write my own exporter, which writes the clustering
output to an HDF5 file that I can read in Python. Before I do this, I'm
wondering whether there is a simple, existing solution that people are
already using.

Jeremy





