GitHub user sethah commented on the issue:

    https://github.com/apache/spark/pull/14937
  
    @yanboliang I ran some tests on dense synthetic data on a 3-node bare-metal 
cluster (144 cores, 384 GB RAM). I installed OpenBLAS customized for the 
hardware on the nodes. I can confirm that it is successfully using NativeBLAS, 
but I am not positive that the build is actually optimized.
    
    With this patch, I initially saw iteration times of roughly 10 minutes, 
compared to ~30 seconds on the master branch. After refactoring the code to 
avoid some copying, I still see about a 3-5x slowdown with this approach. I am 
still working through some of the timings and have not experimented much with 
the block size; I will post more details later. For now, I can point out that 
copying the center 
[here](https://github.com/yanboliang/spark/blob/1c31cda0f78b8c2b11406d76da447e9b3216a97d/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala#L379)
 seems to have a huge impact.
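
    To illustrate the kind of copying overhead I mean, here is a minimal, 
hypothetical sketch. It is not the code from the patch: `CenterCopySketch`, 
`packCenters`, and `doublesCopied` are made-up names, and the block loop only 
counts how many doubles are copied instead of calling BLAS. It assumes the 
centers are packed into one contiguous array before each block-level call.

```scala
// Hypothetical sketch: names and structure are illustrative, not taken from the patch.
object CenterCopySketch {
  // Pack k centers of dimension d into one contiguous row-major array.
  def packCenters(centers: Array[Array[Double]]): Array[Double] = {
    val d = centers.head.length
    val packed = new Array[Double](centers.length * d)
    var i = 0
    while (i < centers.length) {
      System.arraycopy(centers(i), 0, packed, i * d, d)
      i += 1
    }
    packed
  }

  // Count how many doubles are copied while processing numBlocks blocks of points.
  // repackPerBlock = true  : centers are re-copied for every block.
  // repackPerBlock = false : centers are copied once and reused for all blocks.
  def doublesCopied(
      centers: Array[Array[Double]],
      numBlocks: Int,
      repackPerBlock: Boolean): Long = {
    var copied = 0L
    var packed: Array[Double] = null
    var b = 0
    while (b < numBlocks) {
      if (repackPerBlock || packed == null) {
        packed = packCenters(centers)
        copied += packed.length
      }
      // A real implementation would pass `packed` to a BLAS call for this block.
      b += 1
    }
    copied
  }
}
```

    With k centers of dimension d and B blocks, the first variant copies 
k * d * B doubles per iteration and the second copies only k * d, so the 
difference grows with the number of blocks.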

