Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/18936
I'm not actually sure why the f2jblas implementation was originally forced
in level 1, but IIRC it was due to some benchmarking at that time. The
intuition is that the overhead of calling into native BLAS exceeds the benefit
given that pure JVM for level 1 is "fast enough".
Originally in the ALS recommend work the benchmarks showed f2jblas was
faster, but then later it seemed the poor native performance was perhaps due to
threading settings etc. So it does bear some investigation.
As Sean mentions, I think there needs to be some more extensive
benchmarking than just a few methods and one of the algorithms. Also note that
"native" !== "mkl", so that benchmarking needs to include at least OpenBLAS and
any others that are common.
Note that this would only impact `mllib` since the `ml` aggregators do the
on-the-fly standardization and don't in general use the level 1 BLAS ops. So I
don't think the overall impact here is that important since `mllib` is in
maintenance mode.
Overall, if we go ahead with this kind of change then we need to benchmark
standalone and all the code paths it touches, and furthermore decide if it is
worth the change to impact `mllib` only.
cc also @sethah
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]