Re: [scikit-learn] A basic question about kmeans algorithms elkan and llyod

Andreas Mueller Fri, 27 Mar 2020 09:39:01 -0700

There's an interesting analysis in this paper, "Fast K-Means with
Accurate Bounds":

http://proceedings.mlr.press/v48/newling16.pdf

On 3/26/20 3:40 AM, Alexandre Gramfort wrote:

Hi,

I suspect Elkan only really wins when you have many centroids,
so the conclusion is not systematic.

my 2c
Alex
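
One way to check this on your own data is to time both variants while
varying the number of centroids. The snippet below is only a rough
sketch, not a proper benchmark: the synthetic data from make_blobs, the
sizes, and the helper name time_fit are all assumptions made for
illustration. It also assumes the plain Lloyd variant is selected with
algorithm="lloyd" (scikit-learn 1.1 and later) or algorithm="full"
(earlier releases).

```python
import time

import sklearn
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Plain Lloyd is "lloyd" in scikit-learn >= 1.1 and "full" before that.
_major, _minor = (int(p) for p in sklearn.__version__.split(".")[:2])
LLOYD = "lloyd" if (_major, _minor) >= (1, 1) else "full"


def time_fit(X, n_clusters, algorithm, seed=0):
    """Hypothetical helper: wall-clock seconds for a single KMeans fit."""
    km = KMeans(n_clusters=n_clusters, algorithm=algorithm,
                n_init=1, random_state=seed)
    start = time.perf_counter()
    km.fit(X)
    return time.perf_counter() - start


# Synthetic dense data; sizes are arbitrary and kept small on purpose.
X, _ = make_blobs(n_samples=3000, n_features=10, centers=50, random_state=0)

timings = {}
for k in (5, 50, 200):
    timings[k] = {
        "lloyd": time_fit(X, k, LLOYD),
        "elkan": time_fit(X, k, "elkan"),
    }
    print(k, timings[k])
```

Single-run timings are noisy, so repeat the measurement and use data
shaped like your real workload before drawing any conclusion.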

On Thu, Mar 26, 2020 at 3:18 AM [email protected] wrote:


    Hi admins,

    My team is currently working on optimizing parts of scikit-learn.
    For kmeans, I see there are two algorithms: lloyd, and elkan, which
    is an optimized version of lloyd that uses the triangle inequality.
    In older versions of scikit-learn, elkan only supported dense
    datasets, not sparse ones; in the latest version, elkan supports
    both types. So my question is: why are both algorithms kept in
    kmeans, given that they do almost the same thing and elkan is the
    optimized version of lloyd? Is there any difference in precision
    between the two, and how can I decide which algorithm to use?

    Best regards,

    George Fan
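
On the precision question: Elkan's method uses the triangle inequality
only to skip distance computations, so when started from the same
initial centers it should reach the same clustering as Lloyd, up to
floating-point effects. A small sketch to check that, assuming
synthetic data and a hand-picked set of initial centers (all variable
names here are illustrative only):

```python
import numpy as np
import sklearn
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Plain Lloyd is "lloyd" in scikit-learn >= 1.1 and "full" before that.
_major, _minor = (int(p) for p in sklearn.__version__.split(".")[:2])
LLOYD = "lloyd" if (_major, _minor) >= (1, 1) else "full"

# Synthetic data, assumed here only for the comparison.
X, _ = make_blobs(n_samples=2000, n_features=5, centers=8, random_state=42)

# Use identical starting centers so both runs follow the same path.
rng = np.random.RandomState(0)
init = X[rng.choice(len(X), size=8, replace=False)]

fits = {
    algo: KMeans(n_clusters=8, init=init, n_init=1, algorithm=algo).fit(X)
    for algo in (LLOYD, "elkan")
}

# Fraction of samples given the same label by both algorithms.
agreement = np.mean(fits[LLOYD].labels_ == fits["elkan"].labels_)
# Relative difference in the final objective (sum of squared distances).
inertia_gap = (abs(fits[LLOYD].inertia_ - fits["elkan"].inertia_)
               / fits[LLOYD].inertia_)
print(agreement, inertia_gap)
```

If the two runs disagree by more than rounding noise on your data, that
is worth reporting, since the two algorithms are meant to be equivalent.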

    _______________________________________________
    scikit-learn mailing list
    [email protected]
    https://mail.python.org/mailman/listinfo/scikit-learn

