There's an interesting analysis in this paper:
Fast K-Means with Accurate Bounds
http://proceedings.mlr.press/v48/newling16.pdf
On 3/26/20 3:40 AM, Alexandre Gramfort wrote:
hi,
I suspect Elkan really wins when you have many centroids,
so the conclusion is not systematic.
my 2c
Alex
On Thu, Mar 26, 2020 at 3:18 AM mc_george...@hotmail.com wrote:
Hi admins,
My team is currently working on optimizing scikit-learn. For
kmeans, I see there are two algorithms, lloyd and elkan; elkan is
an optimized version of lloyd that uses the triangle inequality.
In older versions of scikit-learn, elkan only supported dense
datasets, not sparse ones. In the latest version, elkan supports
both types of dataset. So my question is why both algorithms are
kept in kmeans, given that they do almost the same thing and elkan
is the optimized one. Is there any difference in precision between
the two algorithms, and how should I decide which one to use?
Best regards,
George Fan
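
To make the triangle-inequality point above concrete, here is a minimal
pure-Python sketch of a single assignment step, written for illustration
only. It is not scikit-learn's implementation: the function names
(assign_full, assign_pruned) and the toy data are made up for this
example, and the pruning shown is only one of the ideas in Elkan's
algorithm, which additionally keeps upper and lower distance bounds
across iterations. The sketch suggests why the two variants are expected
to produce the same assignments while the pruned one computes fewer
distances.

```python
import math    # math.dist requires Python 3.8+
import random


def assign_full(points, centers):
    """Lloyd-style assignment: compare every point with every center."""
    labels = [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]
    return labels, len(points) * len(centers)


def assign_pruned(points, centers):
    """Same assignment, skipping centers ruled out by the triangle inequality."""
    k = len(centers)
    # Center-to-center distances, computed once per assignment step.
    cc = [[math.dist(centers[i], centers[j]) for j in range(k)] for i in range(k)]
    labels, n_dist = [], 0
    for p in points:
        best, best_d = 0, math.dist(p, centers[0])
        n_dist += 1
        for j in range(1, k):
            # If d(c_best, c_j) >= 2 * d(p, c_best), then
            # d(p, c_j) >= d(c_best, c_j) - d(p, c_best) >= d(p, c_best),
            # so c_j cannot be strictly closer and is skipped.
            if cc[best][j] >= 2 * best_d:
                continue
            d = math.dist(p, centers[j])
            n_dist += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, n_dist


# Hypothetical toy data: four well-separated centers, ten points near each.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
expected = [i % 4 for i in range(40)]
points = [
    (centers[c][0] + rng.uniform(-1, 1), centers[c][1] + rng.uniform(-1, 1))
    for c in expected
]

labels_full, n_full = assign_full(points, centers)
labels_pruned, n_pruned = assign_pruned(points, centers)
print("same labels:", labels_full == labels_pruned)
print("distance computations:", n_full, "vs", n_pruned)
```

On this toy data both functions return the same labels, and the pruned
version evaluates fewer point-to-center distances. How much is saved in
practice depends on the data and on the number of centers.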
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn