Hi Arno,
See the forwarded message from the octave-help mailing list. Søren has
the following function, which he modified to be Matlab compatible. Do
you think it could be added to the statistics package?
Carnë
---------- Forwarded message ----------
From: Søren Hauberg <so...@hauberg.org>
Date: 9 December 2011 08:26
Subject: Re: K means.
To: Carnë Draug <carandraug+...@gmail.com>
Cc: Jordi Gutiérrez Hermoso <jord...@octave.org>, h...@octave.org,
Prachi Jain <prachijain...@gmail.com>
On Fri, 09 Dec 2011 at 08:00 +0100, Søren Hauberg wrote:
> My current version works like
>
> clusters = kmeans (data, initial_clusters);
>
> whereas Matlab's works like
>
> clusters = kmeans (data, number_of_clusters);
>
> I think we should at least match this before putting it into the
> statistics package. But otherwise, I see no problems with not having all
> the options available that Matlab has.
The attached version has a more compatible API. It's not perfect, but it
seems to work fairly well.
Søren
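For reference, a minimal sketch of what the change in calling convention amounts to. It assumes the earlier version expected the caller to pass a k-by-D matrix of starting centers, which the thread implies but does not show; the toy data and the name initial_clusters below are illustrative only. With the newer API the caller passes just k, and the function draws the starting centers itself by sampling k distinct rows of the data:

```octave
## Toy data: 5 points in 2 dimensions (made up for illustration)
data = [0 0; 0 1; 10 10; 10 11; 5 5];
k = 2;

## Sample k distinct row indices, as the "sample" start option does
N = size (data, 1);
idx = randperm (N) (1:k);

## These rows are what an "initial_clusters" argument would have held
initial_clusters = data (idx, :);
```

Because the rows are drawn at random, repeated runs can start from different centers and may end in different clusterings.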
## Copyright (C) 2011 Soren Hauberg
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
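function [classes, centers] = kmeans (data, k, varargin)
  ## Input checking
  if (! ismatrix (data) || ! isreal (data))
    error ("kmeans: first input argument must be an NxD real data matrix");
  endif
  if (! isscalar (k) || k < 1 || k != fix (k))
    error ("kmeans: second input argument must be a positive integer scalar");
  endif
  [N, D] = size (data);
  if (k > N)
    error ("kmeans: number of clusters cannot exceed the number of data points");
  endif

  ## (So far) hardcoded options; varargin is accepted but not yet used
  maxiter = Inf;
  start = "sample";

  ## Find initial clusters
  switch (lower (start))
    case "sample"
      ## Pick k distinct rows of data at random as the initial centers
      idx = randperm (N) (1:k);
      centers = data (idx, :);
    otherwise
      error ("kmeans: unsupported initial clustering parameter");
  endswitch

  ## Run the algorithm
  dist = zeros (N, k);  # squared distance from each point to each center
  iterations = 0;
  prevcenters = centers;
  while (true)
    ## Compute squared Euclidean distances
    for i = 1:k
      dist (:, i) = sum ((data - repmat (centers (i, :), N, 1)).^2, 2);
    endfor

    ## Classify each point by its nearest center
    [~, classes] = min (dist, [], 2);

    ## Recompute centers; an empty cluster keeps its previous center
    for i = 1:k
      members = (classes == i);
      if (any (members))
        ## Average along dimension 1 so a single-member cluster
        ## yields a 1xD row instead of a scalar
        centers (i, :) = mean (data (members, :), 1);
      endif
    endfor

    ## Check for convergence
    iterations++;
    if (all (centers (:) == prevcenters (:)) || iterations >= maxiter)
      break;
    endif
    prevcenters = centers;
  endwhile
endfunction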
%!demo
%! ## Generate a two-cluster problem
%! C1 = randn (100, 2) + 1;
%! C2 = randn (100, 2) - 1;
%! data = [C1; C2];
%!
%! ## Perform clustering
%! [idx, centers] = kmeans (data, 2);
%!
%! ## Plot the result
%! figure
%! plot (data (idx==1, 1), data (idx==1, 2), 'ro');
%! hold on
%! plot (data (idx==2, 1), data (idx==2, 2), 'bs');
%! plot (centers (:, 1), centers (:, 2), 'kv', 'markersize', 10);
%! hold off
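To make the loop body concrete, here is a single iteration (distances, classification, center update) worked on six points with fixed starting centers. The data and the starting centers are made up for illustration and are not part of the attachment; fixing the centers instead of sampling them keeps the result deterministic.

```octave
## Two well-separated groups of three points each (illustrative data)
data = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
centers = [0 0; 10 10];       # fixed starting centers, one per group
N = size (data, 1);
k = size (centers, 1);

## Squared Euclidean distance from every point to every center
dist = zeros (N, k);
for i = 1:k
  dist (:, i) = sum ((data - repmat (centers (i, :), N, 1)).^2, 2);
endfor

## Assign each point to its nearest center
[~, classes] = min (dist, [], 2);

## Move each center to the mean of its members (along dimension 1)
for i = 1:k
  centers (i, :) = mean (data (classes == i, :), 1);
endfor
```

After this step the first three points belong to cluster 1 and the last three to cluster 2, and the centers sit at (1/3, 1/3) and (31/3, 31/3). A second iteration would leave both unchanged, which is the convergence condition the loop tests for.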