GitHub user sethah commented on the issue:
https://github.com/apache/spark/pull/14937
@yanboliang I ran some tests on dense synthetic data on a 3-node bare-metal
cluster (144 cores, 384 GB RAM). I installed OpenBLAS customized for the
hardware on the nodes (I can confirm it's successfully using NativeBLAS,
though I'm not positive it's optimized).
With this patch, I was initially seeing iteration times of around 10 minutes,
compared to ~30 seconds on the master branch. After refactoring the code to
avoid some copying, I am still seeing about a 3-5x slowdown with this
approach. I am still working through some of the timings, and I haven't
experimented much with the block size. I will give more details at some
point. For now, I can point out that copying the center
[here](https://github.com/yanboliang/spark/blob/1c31cda0f78b8c2b11406d76da447e9b3216a97d/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala#L379)
seems to have a huge impact.