Well, after 5 months (I had to set it aside for about 3 of them, and there
was some waiting on the MST algorithm pull request), my PR on the EAC
algorithm is ready for review:
PR #1830 <https://github.com/scikit-learn/scikit-learn/pull/1830>
This implements the Evidence Accumulation Clustering algorithm
<http://scholar.google.com.au/scholar?cluster=4805287106484569229&hl=en&as_sdt=0,5>,
which is a framework for cluster ensembling. By default, the algorithm uses
K-means and MSTCluster, but both of these can be changed.
The MSTCluster used here is also, I feel, very intuitive. The threshold
value is the inverse of the percentage of times k-means clusters two
objects together. Set at 0.8-0.9, it does quite well on a number of
datasets.
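
For anyone who wants a feel for the idea before reading the PR, here is a
rough sketch of it using existing scikit-learn and SciPy calls. It is not the
code in the PR: `eac_sketch` and its parameters are hypothetical names, and it
assumes that "inverse" means a distance of 1 minus the fraction of k-means
runs that put two objects in the same cluster, with MST edges above the
threshold being cut.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from sklearn.cluster import KMeans


def eac_sketch(X, k_values=(2, 3, 4), n_runs=30, threshold=0.85, seed=0):
    """Illustrative only; hypothetical helper, not the estimator in the PR."""
    n_samples = X.shape[0]
    coassoc = np.zeros((n_samples, n_samples))

    # Accumulate evidence: count how often each pair shares a k-means cluster.
    for run in range(n_runs):
        k = k_values[run % len(k_values)]  # cycle through the candidate k
        labels = KMeans(n_clusters=k, n_init=1,
                        random_state=seed + run).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= n_runs

    # Assumption: distance = 1 - fraction of runs in which the pair co-occurs.
    eps = 1e-9
    weights = (1.0 - coassoc) + eps  # csgraph treats 0 as "no edge"
    np.fill_diagonal(weights, 0.0)   # no self-loops

    # Build the minimum spanning tree and cut every edge above the threshold.
    mst = minimum_spanning_tree(csr_matrix(weights))
    mst.data[mst.data > threshold + eps] = 0.0
    mst.eliminate_zeros()

    # The remaining connected components are the final clusters.
    _, final_labels = connected_components(mst, directed=False)
    return final_labels


# Two well-separated synthetic blobs of 20 points each.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)),
               rng.normal(100.0, 1.0, (20, 2))])
labels = eac_sketch(X)
```

The small `eps` shift is only there because SciPy's graph routines read a
zero weight as a missing edge, so pairs that always co-occur would otherwise
be dropped from the tree.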
So, if anyone has some time, please have a look at the PR and let me know
if you see anything that could be improved.
Thanks,
Robert
--
Public key at: http://pgp.mit.edu/ Search for this email address and select
the key from "2011-08-19" (key id: 54BA8735)