> I am working on a summary table on clustering methods. It is not
> finished, I need to do a bit more literature review, however, I'd love
> some feedback on the current status:
> https://github.com/GaelVaroquaux/scikit-learn/blob/master/doc/modules/clustering.rst
>
>
>    
Thanks for starting on this overview.
For the input, I would hope we can implement Olivier's proposal soon
so that we don't need to differentiate the different input types.

I'm not sure "flat geometry" is a good way to describe the case that
KMeans works in. I would have said "convex clusters". I'm not sure how
far that applies to hierarchical clustering, though.

Adding to what Olivier said, I would not call KMeans and Spectral
clustering "scaling well".  I think stuff scales well if I can run
MNIST on my laptop ;)

By the way, do you know why mean-shift performs so poorly on the
first two examples? Is it because the bandwidth is not set "correctly"?
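To make the bandwidth question concrete, here is a rough sketch of how one could check the sensitivity. It assumes a two-moons toy dataset and a handful of quantile values, both of which are my own choices for illustration and not necessarily what the example plot uses.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_moons

# Assumption: a two-moons toy dataset stands in for the example data.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)


def n_clusters_for(quantile):
    """Return (bandwidth, number of clusters) for a given quantile."""
    # estimate_bandwidth averages each point's distance to its
    # quantile-th nearest neighbour, so a larger quantile gives a
    # larger bandwidth.
    bandwidth = estimate_bandwidth(X, quantile=quantile)
    labels = MeanShift(bandwidth=bandwidth).fit(X).labels_
    return bandwidth, len(np.unique(labels))


# Hypothetical quantile values, chosen only to show the trend.
results = {q: n_clusters_for(q) for q in (0.1, 0.3, 0.5)}
for quantile, (bandwidth, n_clusters) in results.items():
    print(f"quantile={quantile}: bandwidth={bandwidth:.3f}, "
          f"clusters={n_clusters}")
```

If the number of clusters changes a lot across quantiles, the poor result is probably down to the bandwidth and not the algorithm itself.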

Also, I would mention explicitly that clustering algorithms are often
evaluated with ARI or AMI on classification data, since there is not
really any other data available, and explain why this is bad ;)
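A minimal sketch of one reason this can mislead, as I understand it: the score depends entirely on which labeling you decide is the "truth". It assumes the digits dataset and KMeans with 10 clusters, and the odd/even relabeling is a made-up alternative ground truth used only for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

digits = load_digits()
X, y = digits.data, digits.target

# One fixed clustering of the data.
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X)

# Score against the usual class labels (digit identity).
ari = adjusted_rand_score(y, labels)
ami = adjusted_mutual_info_score(y, labels)

# Score the same clustering against a different, equally valid
# labeling of the same images (hypothetical choice: odd vs. even).
parity = y % 2
ari_parity = adjusted_rand_score(parity, labels)
ami_parity = adjusted_mutual_info_score(parity, labels)

print(f"digit identity: ARI={ari:.2f}, AMI={ami:.2f}")
print(f"odd/even:       ARI={ari_parity:.2f}, AMI={ami_parity:.2f}")
```

The clustering did not change between the two rows, only the labels it is compared to, so the number mostly tells you how well the clusters match one particular annotation.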

I am currently working on a clustering algorithm, and it is really hard
to say what it means for a clustering algorithm to fail.

Oh and one more thing: For spectral clustering, I think we implement
the Shi/Malik version, not the Jordan/Ng version. Though adding
this as an option would probably be quite easy. This should probably
also be made explicit in the docs.
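For the docs, the difference between the two variants could be sketched roughly as below. This is plain NumPy on a small random affinity matrix, not the scikit-learn code, and the data, the kernel width and k are arbitrary assumptions. Shi/Malik solves the generalized problem (D - A) v = lambda D v, while Ng/Jordan/Weiss takes the top eigenvectors of D^-1/2 A D^-1/2 and normalizes the rows.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(20, 2)  # arbitrary toy points

# Gaussian affinity with an arbitrary width; no self-loops.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-sq_dists / 0.1)
np.fill_diagonal(A, 0.0)

d = A.sum(axis=1)              # degrees
d_inv_sqrt = 1.0 / np.sqrt(d)

# Symmetrically normalized affinity M = D^-1/2 A D^-1/2.
M = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
eigvals, U = np.linalg.eigh(M)  # eigenvalues in ascending order

k = 2
U_k = U[:, -k:]                 # eigenvectors of the k largest eigenvalues

# Ng/Jordan/Weiss embedding: normalize each row of U_k to unit length.
Y_njw = U_k / np.linalg.norm(U_k, axis=1, keepdims=True)

# Shi/Malik embedding: v = D^-1/2 u solves (D - A) v = (1 - mu) D v,
# where mu is the corresponding eigenvalue of M.
V_sm = d_inv_sqrt[:, None] * U_k
```

Both embeddings would then be clustered with KMeans. Since they share the same eigendecomposition, offering the other one as an option should only need the different rescaling step.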

Btw, is there an easy way to diff this file against master without
checking it out?

Thanks for giving more structure to the narratives!
I feel this is very important work.

Cheers,
Andy

ps: Maybe I'll find time to do the "fit_distance"/"fit_kernel" API in 
one or two weeks.




_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
