> I am working on a summary table on clustering methods. It is not
> finished, I need to do a bit more literature review, however, I'd love
> some feedback on the current status:
> https://github.com/GaelVaroquaux/scikit-learn/blob/master/doc/modules/clustering.rst

Thanks for starting on this overview. For the input, I would hope we can implement Olivier's proposal soon so that we don't need to differentiate the different input types.

I'm not sure "flat geometry" is a good way to describe the case that KMeans works in. I would have said "convex clusters". I'm not sure how far that applies to hierarchical clustering, though.

Adding to what Olivier said, I would not call KMeans and Spectral clustering "scaling well". I think stuff scales well if I can run MNIST on my laptop ;)

By the way, do you know why mean-shift performs so poorly on the first two examples? Is it because the bandwidth is not set "correctly"?

Also, I would mention explicitly that clustering algorithms are often evaluated using ARI or AMI on classification data, since there is not really any other data available, and why this is bad ;) I am just working on a clustering algorithm and it is really hard to say what it means for a clustering algorithm to fail.

Oh, and one more thing: for spectral clustering, I think we implement the Shi/Malik version, not the Jordan/Ng version, though adding this as an option would probably be quite easy. This should probably also be made explicit in the docs.

Btw, is there an easy way to diff this file against master without checking it out?

Thanks for giving more structure to the narratives! I feel this is very important work.

Cheers,
Andy

ps: Maybe I'll find time to do the "fit_distance"/"fit_kernel" API in one or two weeks.

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
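
To make the ARI point above concrete, here is a minimal sketch in pure Python. It assumes the usual pair-counting (Hubert and Arabie) form of the adjusted Rand index, and the toy labels are hypothetical: eight points that can be grouped either by their class label or by a second, equally clean attribute. A clustering that recovers the second grouping is not "wrong", yet it scores below chance against the class labels.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings (pair-counting form)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: the partitions trivially agree

    # Pairs placed together in both labelings (from the contingency table).
    together_in_both = sum(
        comb(count, 2) for count in Counter(zip(labels_true, labels_pred)).values()
    )
    # Pairs placed together within each labeling separately.
    together_true = sum(comb(count, 2) for count in Counter(labels_true).values())
    together_pred = sum(comb(count, 2) for count in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_in_both - expected) / (maximum - expected)


# Hypothetical toy data: each point has a class label and a second attribute.
class_labels = [0, 0, 0, 0, 1, 1, 1, 1]
other_grouping = [0, 0, 1, 1, 0, 0, 1, 1]  # a clean split, just not the classes

print(adjusted_rand_index(class_labels, class_labels))    # perfect agreement
print(adjusted_rand_index(class_labels, other_grouping))  # negative score
```

If I am not mistaken, `sklearn.metrics.adjusted_rand_score` should return the same values for these inputs; the hand-rolled version is only here so the counting is visible.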
