Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18936 I'm not actually sure why the f2jblas implementation was originally forced in level 1, but IIRC it was due to some benchmarking at that time. The intuition is that the overhead of calling into native BLAS exceeds the benefit given that pure JVM for level 1 is "fast enough". Originally in the ALS recommend work the benchmarks showed f2jblas was faster, but then later it seemed the poor native performance was perhaps due to threading settings etc. So it does bear some investigation. As Sean mentions, I think there needs to be some more extensive benchmarking than just a few methods and one of the algorithms. Also note that "native" !== "mkl", so that benchmarking needs to include at least OpenBLAS and any others that are common. Note that this would only impact `mllib` since the `ml` aggregators do the on-the-fly standardization and don't in general use the level 1 BLAS ops. So I don't think the overall impact here is that important since `mllib` is in maintenance mode. Overall, if we go ahead with this kind of change then we need to benchmark standalone and all the code paths it touches, and furthermore decide if it is worth the change to impact `mllib` only. cc also @sethah
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org