Minibatch K-means should work just fine. Alternatively, there are Hebbian
K-means approaches, which are quite easy to implement and should be fast
(though I think they basically boil down to minibatch K-means; I haven't
looked at the details of minibatch K-means). There is an approach at
http://www.iro.umontreal.ca/~memisevr/code.html that could be useful once
the website is fixed...
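
For the minibatch route, here is a minimal sketch using scikit-learn's
MiniBatchKMeans. The data is a synthetic stand-in with MNIST's 784 features
(three Gaussian blobs), and the cluster count, batch size and seeds are
arbitrary values picked for illustration, not tuned recommendations.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic stand-in for MNIST (assumption): 3 blobs in 784 dimensions.
rng = np.random.default_rng(0)
blob_centers = rng.normal(scale=10.0, size=(3, 784))
X = np.vstack([c + rng.normal(size=(500, 784)) for c in blob_centers])
rng.shuffle(X)  # shuffle rows in place so batches mix all blobs

# Each update step uses only `batch_size` samples, which keeps the
# per-step cost low on large data sets.
km = MiniBatchKMeans(n_clusters=3, batch_size=256, n_init=3, random_state=0)
km.fit(X)

# Assign every sample to its nearest learned centroid.
labels = km.predict(X)
```

If the data does not fit in memory, the same estimator also has a
partial_fit method that can be called on one chunk at a time.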

I have run the Hebbian K-means approach on CIFAR10, so it should work for
MNIST too.
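
As a rough illustration of what I mean by Hebbian K-means, here is a sketch
of one common reading of it: an online, winner-take-all update where only
the centre closest to each sample is moved towards that sample. The function
name online_kmeans, the 1/count learning rate and the toy 2-D data are my
own choices for this example, not a reference implementation, and it is not
the exact code I ran on CIFAR10.

```python
import numpy as np

def online_kmeans(X, n_clusters, n_epochs=5, seed=0):
    """Winner-take-all (competitive learning) K-means, one sample at a time."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from randomly chosen distinct samples.
    idx = rng.choice(len(X), size=n_clusters, replace=False)
    centers = X[idx].astype(float)
    counts = np.zeros(n_clusters)
    for _ in range(n_epochs):
        for i in rng.permutation(len(X)):
            x = X[i]
            # The "winner" is the centre nearest to the sample.
            k = np.argmin(((centers - x) ** 2).sum(axis=1))
            counts[k] += 1
            # Move only the winner towards the sample; the step size
            # 1/count makes it a running mean of the samples it has won.
            centers[k] += (x - centers[k]) / counts[k]
    return centers

# Toy data (assumption): three 2-D Gaussian blobs.
rng = np.random.default_rng(1)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(size=(200, 2)) for c in true_centers])

centers = online_kmeans(X, n_clusters=3)
# Nearest-centre assignment for every sample.
labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(axis=2), axis=1)
```

A fixed small learning rate instead of 1/count is another common choice;
either way, each update touches a single centre, which is why it is cheap.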

On Thu, Jun 18, 2015 at 8:47 AM, Vince Fernando <y...@vincefernando.co.uk>
wrote:

> What is the best routine in scikit-learn (or elsewhere) for clustering large
> data sets such as MNIST?
> I asked a similar question last year but would like to hear an update.
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>