I have a local version, which I submitted a long time ago. I am using
it on real data, and it is not giving the same point for all clusters.
However, I haven't tried the latest Mahout code. I have kept my code
writing its output as text so that it is easy for me to verify. The
current Mahout code, however, outputs binary data (as a SequenceFile),
so it is difficult to verify.
Thanks
Pallavi
Robin Anil wrote:
Have you verified the trunk code on some real data? I am getting the same
point for all clusters regardless of the distance measure.
Robin
On Wed, Feb 17, 2010 at 6:41 PM, Pallavi Palleti <
pallavi.pall...@corp.aol.com> wrote:
Yes. It shouldn't be a problem. My point was that we are inheriting
numpoints as part of ClusterBase, though we are not using it in SoftCluster.
Other than that, I don't see any issue w.r.t. functionality.
Thanks
Pallavi
Robin Anil wrote:
In the impl of SoftCluster, on writeOut it calculates the centroid and
writes it, and on read(in) it reads the centroid into the center.
ClusterDumper reads into the ClusterBase and does value.getCenter();
It should work normally, right?
Robin
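A minimal sketch of the round trip described above, assuming a
simplified stand-in class rather than Mahout's actual SoftCluster. The
names SoftClusterSketch, write, readFields, and roundTrip are
hypothetical (the method names only borrow the Hadoop Writable naming
convention), and vectors are plain double arrays. It shows why a reader
that only calls getCenter() would see the centroid, provided the writer
computes it before serializing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class RoundTripSketch {

    // Hypothetical stand-in for a soft cluster; not Mahout's class.
    static final class SoftClusterSketch {
        double[] center = new double[0];
        double[] weightedPointTotal = new double[0];
        double totalProbWeight;

        // Centroid = weighted point total / total membership weight.
        double[] computeCentroid() {
            double[] centroid = new double[weightedPointTotal.length];
            for (int i = 0; i < centroid.length; i++) {
                centroid[i] = weightedPointTotal[i] / totalProbWeight;
            }
            return centroid;
        }

        // On write, compute the centroid and serialize only that.
        void write(DataOutput out) throws IOException {
            double[] centroid = computeCentroid();
            out.writeInt(centroid.length);
            for (double v : centroid) {
                out.writeDouble(v);
            }
        }

        // On read, load the serialized centroid into the center field.
        void readFields(DataInput in) throws IOException {
            int n = in.readInt();
            center = new double[n];
            for (int i = 0; i < n; i++) {
                center[i] = in.readDouble();
            }
        }

        double[] getCenter() {
            return center;
        }
    }

    // Writes a cluster to bytes, reads it back, and returns the center
    // that a dumper-style reader would see.
    static double[] roundTrip(double[] weightedPointTotal, double totalProbWeight) {
        try {
            SoftClusterSketch written = new SoftClusterSketch();
            written.weightedPointTotal = weightedPointTotal.clone();
            written.totalProbWeight = totalProbWeight;

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            written.write(new DataOutputStream(bytes));

            SoftClusterSketch read = new SoftClusterSketch();
            read.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return read.getCenter();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If every cluster came back with the same center in a setup like this,
the totals fed into computeCentroid would be the first thing to check,
since the read side only echoes what was written.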
On Wed, Feb 17, 2010 at 6:02 PM, Pallavi Palleti <
pallavi.pall...@corp.aol.com> wrote:
Yes. But not the total number of points. So, the numpoints from
ClusterBase will not be used in SoftCluster. numpoints is specific to
k-means, similar to the weighted point total for fuzzy k-means.
Robin Anil wrote:
The center is still the averaged-out centroid, right?
weightedtotalvector/totalprobWeight
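The two averages being compared in this thread can be sketched side by
side. This is only an illustration under assumptions: CentroidSketch,
kmeansCenter, and fuzzyCenter are hypothetical names, vectors are plain
double arrays, and none of this is Mahout's ClusterBase or SoftCluster
code.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Mahout.
final class CentroidSketch {

    // k-means style: pointTotal / numPoints.
    static double[] kmeansCenter(double[] pointTotal, int numPoints) {
        double[] center = new double[pointTotal.length];
        for (int i = 0; i < center.length; i++) {
            center[i] = pointTotal[i] / numPoints;
        }
        return center;
    }

    // Fuzzy k-means style: weighted point total / total membership weight.
    static double[] fuzzyCenter(double[] weightedPointTotal, double totalProbWeight) {
        double[] center = new double[weightedPointTotal.length];
        for (int i = 0; i < center.length; i++) {
            center[i] = weightedPointTotal[i] / totalProbWeight;
        }
        return center;
    }

    public static void main(String[] args) {
        // Two points summed to (2, 4), averaged over a count of 2.
        System.out.println(Arrays.toString(kmeansCenter(new double[] {2.0, 4.0}, 2)));
        // Weighted sum (3, 6), averaged over a total weight of 1.5.
        System.out.println(Arrays.toString(fuzzyCenter(new double[] {3.0, 6.0}, 1.5)));
    }
}
```

Both forms produce an averaged center; they differ only in the
denominator, which is why a point count kept by a base class would go
unused in the fuzzy case.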
On Wed, Feb 17, 2010 at 5:10 PM, Pallavi Palleti <
pallavi.pall...@corp.aol.com> wrote:
I haven't yet gone through ClusterDumper. However, ClusterBase would
have the number of points to average out (pointTotal/numPoints as per
k-means), whereas SoftCluster will have a weighted point total. So, I am
wondering how we can reuse ClusterBase here?
Thanks
Pallavi
Robin Anil wrote:
Yes, so that ClusterDumper can print it out.
On Wed, Feb 17, 2010 at 5:02 PM, Pallavi Palleti <
pallavi.pall...@corp.aol.com> wrote:
Hi Robin,
When you mentioned reusing ClusterBase, are you planning to extend
ClusterBase in SoftCluster? For example, SoftCluster extends
ClusterBase?
Thanks
Pallavi
Robin Anil wrote:
I have been trying to convert the FuzzyKMeans SoftCluster (which should
ideally be named FuzzyKmeansCluster) to use ClusterBase.
I am getting *the same center* for all the clusters. To aid the
conversion, all I did was remove the center vector from the SoftCluster
class and reuse the one from ClusterBase. These changes make essentially
no difference to the tests, which still pass.
So I am questioning whether the implementation keeps the average center
at all. Has anyone who has used FuzzyKMeans experienced this?
Robin