Hi,

I just registered as ckling at SourceForge and would like to register as 
an Octave-Forge developer. I would like to submit a bugfix for kmeans.m 
in the statistics package:

When a cluster contains only one element, data(classes == i, :) is a 
row vector, and mean() averages along the first non-singleton dimension. 
Instead of returning the true center of the cluster (the element 
itself), it therefore returns the mean of all coordinates of that single 
cluster element.
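
To illustrate the problem, here is a minimal sketch. The data and the variable names (single, wrong, right, and so on) are made up for this example and are not taken from kmeans.m:

```octave
## Three 3-D points; cluster 1 holds a single point, cluster 2 holds two.
data    = [1 2 3; 10 20 30; 11 21 31];
classes = [1; 2; 2];

## Indexing a one-element cluster yields a 1x3 row vector.
single = data(classes == 1, :);

## mean() works along the first non-singleton dimension, so for a row
## vector it averages the coordinates and returns a scalar.
wrong = mean (single);

## Summing along dimension 1 explicitly keeps one value per coordinate.
right = sum (single, 1) / size (single, 1);

## Passing the dimension to mean() should behave the same way.
also_right = mean (single, 1);

## For clusters with more than one point, both forms agree.
multi      = data(classes == 2, :);
multi_mean = mean (multi);
multi_sum  = sum (multi, 1) / size (multi, 1);
```

With the scalar result, an assignment such as centers(i, :) = wrong fills every coordinate of the center with the same value, which is why the error does not show up as a dimension mismatch.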

I fixed that issue and would like to submit the changes.
My changes also correct two typos. Additionally, I renamed the variable 
index to idx (to avoid confusion with the function index) and removed 
the unnecessary check for length(mean(...)) in the check for empty 
clusters.

Here is the output of diff:

      ## Classify
      [tmp, classes] = min (D, [], 2);

-    ## Calcualte new centroids
+    ## Calculate new centroids
      for i = 1:k
        ## Check for empty clusters
-      if (sum (classes == i) ==0 || length (mean (data(classes == i, :))) == 0)
+      if (sum (classes == i) ==0)

          switch emptyaction
            ## if 'singleton', then find the point that is the
@@ -94,10 +94,11 @@
       endif ## end check for empty clusters

        ## update the centroids
-      centers(i, :) = mean (data(classes == i, :));
+      elements_class = data(classes == i, :);
+      centers(i, :) = sum(elements_class,1)/size(elements_class,1);
      endfor

-    ## calculate the differnece in the sum of distances
+    ## calculate the difference in the sum of distances
      err  = sumd - objCost (data, classes, centers);
      ## update the current sum of distances
      sumd = objCost (data, classes, centers);
@@ -112,11 +113,11 @@
      endfor
  endfunction

-function index = maxCostSampleIndex (data, centers)
+function idx = maxCostSampleIndex (data, centers)
    cost = 0;
-  for index = 1:rows (data)
-    if cost < sumsq (data(index,:) - centers)
-      cost = sumsq (data(index,:) - centers);
+  for idx = 1:rows (data)
+    if cost < sumsq (data(idx,:) - centers)
+      cost = sumsq (data(idx,:) - centers);
      endif
    endfor
  endfunction



Cheers,
Christoph

-- 
Dipl.-Inf. Christoph Carl Kling
WeST - Institute for Web Science and Technologies
University of Koblenz-Landau, Germany
http://west.uni-koblenz.de


_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev
