Hello,
I have never used Mahout before, and before I invest too much time
reading the API and source code I thought I might get some pointers
from you.
I have several objects, each containing 1..n attributes (actually long/double
values). I want to cluster these objects to get clusters of objects that
are similar with regard to those n attributes.
Then I want to be able to look up which cluster a given object is in and
which other objects also belong to that cluster.
I thought that such a clustering would be possible using Mean Shift
from Mahout (since I don't know in advance how many clusters I will
have; otherwise I would probably use k-means).
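To make clearer what I am after, here is a rough plain-Java sketch of the idea. It does not use Mahout at all: the class and method names, the flat kernel, the radius value and the "merge modes closer than radius / 2" rule are all my own assumptions for illustration, not how MeanShiftCanopy works internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Mahout code: flat-kernel mean shift on double[] points.
class MeanShiftSketch {

    // Euclidean distance between two attribute arrays of equal length.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Repeatedly moves a point to the mean of all data points within `radius`
    // until it moves less than `eps` or `maxIter` is reached.
    static double[] shift(double[] start, double[][] data, double radius,
                          double eps, int maxIter) {
        double[] current = start.clone();
        for (int iter = 0; iter < maxIter; iter++) {
            double[] mean = new double[current.length];
            int count = 0;
            for (double[] p : data) {
                if (euclidean(current, p) <= radius) {
                    for (int i = 0; i < mean.length; i++) {
                        mean[i] += p[i];
                    }
                    count++;
                }
            }
            if (count == 0) {
                break; // no neighbours in the window, keep the current position
            }
            for (int i = 0; i < mean.length; i++) {
                mean[i] /= count;
            }
            double moved = euclidean(current, mean);
            current = mean;
            if (moved < eps) {
                break;
            }
        }
        return current;
    }

    // Returns one cluster label per input point; points whose shifted
    // positions end up closer than radius / 2 share a label (assumed rule).
    static int[] cluster(double[][] data, double radius) {
        List<double[]> modes = new ArrayList<>();
        int[] labels = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            double[] mode = shift(data[i], data, radius, 1e-6, 100);
            int label = -1;
            for (int m = 0; m < modes.size(); m++) {
                if (euclidean(mode, modes.get(m)) < radius / 2) {
                    label = m;
                    break;
                }
            }
            if (label < 0) {
                modes.add(mode);
                label = modes.size() - 1;
            }
            labels[i] = label;
        }
        return labels;
    }

    // Indices of all points that carry the given cluster label.
    static List<Integer> members(int[] labels, int label) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == label) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 1.0}, {1.2, 0.9}, {0.9, 1.1}, {8.0, 8.0}, {8.1, 7.9}
        };
        int[] labels = cluster(data, 2.0);
        System.out.println(Arrays.toString(labels));
        // Which other objects share the cluster of object 3?
        System.out.println(members(labels, labels[3]));
    }
}
```

The two lookups I need are then just labels[i] for "which cluster is my object in" and members(labels, labels[i]) for "which other objects are in it". The number of clusters is not given up front; it falls out of the radius.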
So what I have to do is transform these objects into Vectors and then
cluster them using MeanShiftCanopy and some distance measure (probably
EuclideanDistanceMeasure at the beginning):
    Vector foo = new DenseVector(new double[]{ val1, ..., valn });
and then basically follow what is done in testReferenceImplementation()
of the DisplayMeanShift class (DisplayMeanShift is my entry point so
far).
Is that correct? Is there any other example doing something similar that I
could look at?
Any additional pointers are welcome. I have already read the IBM article
by Grant Ingersoll.
Regards,
Christoph Hermann