Hi,

I was wondering whether someone might be able to help me out. I'd like to
use Mahout via Elastic MapReduce to cluster some datasets, but I'm not sure
I've got the right use case. I'm hoping someone can comment and perhaps
point me towards some further advice.

I have a dataset which is stored in a database and structured as follows:

Item  Value X  Value Y  Value Z
A     2        4        3
A     3        5        6
A     6        7        9
B     5        8        2
B     2        4        7
...

I would like to create a series of clusters for each item based on the
values of X, Y and Z. X and Y are geographic co-ordinates (i.e. real-world
places) and Z is a value observed at those places. What I'd like to end up
with, for each Item, is a series of clusters showing which values of Z
coincide at which place (represented by Values X and Y). I've looked
through and played with the quickstarts and that's all fine, but I'm
wondering:

1.  Is this sort of analysis possible?
2.  How do I convert my numeric data into the correct format to be
processed by a job?
3.  Are there any pointers on how I might configure my job so that it can
be distributed and create a set of clusters for each item?
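
To make question 2 more concrete, here is a rough sketch of the first step
as I currently picture it: split the rows by Item and write each point out
as one line of space-separated numbers. This is plain Python, not Mahout
code. The names ROWS, group_by_item and to_text_lines are made up for
illustration, and the idea that space-separated numeric text is a sensible
starting point is only my assumption from the quickstarts, so please
correct me if it is wrong.

```python
from collections import defaultdict

# Sample rows as they might come back from the database: (item, x, y, z).
# These are the five rows shown in the table above.
ROWS = [
    ("A", 2, 4, 3),
    ("A", 3, 5, 6),
    ("A", 6, 7, 9),
    ("B", 5, 8, 2),
    ("B", 2, 4, 7),
]


def group_by_item(rows):
    """Collect the (x, y, z) points for each item, preserving row order."""
    groups = defaultdict(list)
    for item, x, y, z in rows:
        groups[item].append((float(x), float(y), float(z)))
    return dict(groups)


def to_text_lines(points):
    """Render each point as one line of space-separated numbers."""
    return [" ".join(str(value) for value in point) for point in points]


# Print one block of points per item; each block would become the input
# for a separate clustering run (assumption: one run per item).
groups = group_by_item(ROWS)
for item, points in groups.items():
    print(item)
    for line in to_text_lines(points):
        print("  " + line)
```

Each item's block of lines would then be clustered on its own, which is
the part I don't know how to set up as a distributed job.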

Thank you to anyone who might be able to help. I'm really excited to get
started with Mahout, but I'm struggling to understand whether it's suitable
and how to get started.

Thanks very much,

John
