Regarding selective compilation, you can hide sources behind a Maven profile such as `-Pvectorized`. Check out what we do to switch between the `hive-1.2` and `hive-2.3` profiles where different source directories are grabbed at compile-time (the hive-1.2 profile was recently removed so you might have to go back a little in git history). This won't do it automatically based on JDK version, but it's probably good enough. At runtime you can more easily do a JDK version check -- I agree with Sean on loading via reflection.
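To make the runtime side concrete, here is a minimal sketch of the JDK version check plus reflective loading, with a pure-Java fallback. Everything named here is hypothetical and only for illustration: the `Blas` interface, `ScalarBlas`, `BlasSelector`, and whatever class name gets passed to `load` are not Spark's actual classes. It reads `java.specification.version` rather than `Runtime.version()` on the assumption that the check should also run on Java 8.

```java
final class BlasSelector {

    /** Hypothetical stand-in for the BLAS abstraction; not Spark's actual interface. */
    public interface Blas {
        double dot(double[] x, double[] y);
    }

    /** Pure-Java fallback that needs neither JDK 16 nor the incubator module. */
    public static final class ScalarBlas implements Blas {
        @Override
        public double dot(double[] x, double[] y) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        }
    }

    /** Maps "1.8" to 8 and "11" or "16" to 11 or 16. */
    public static int featureVersion(String specVersion) {
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    /**
     * Tries to instantiate the vectorized implementation by name, so that this
     * class never links against it directly. Falls back to ScalarBlas when the
     * JDK is too old, the class is absent, or it fails to link (for example
     * because the incubator module was not added at startup).
     */
    public static Blas load(String vectorizedClassName, int minFeatureVersion) {
        int current = featureVersion(System.getProperty("java.specification.version"));
        if (current >= minFeatureVersion) {
            try {
                return (Blas) Class.forName(vectorizedClassName)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
                // Vectorized implementation unavailable; use the fallback below.
            }
        }
        return new ScalarBlas();
    }
}
```

A caller would do something like `BlasSelector.load("some.pkg.VectorizedBlas", 16)` once at startup and keep the result. Because the vectorized class is only ever referenced by name, the rest of the code base compiles and runs on older JDKs even when that class was left out of the build.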
Personally, I see no reason not to start adding this support in preparation for broader adoption of JDK 16, provided that it is properly protected behind flags. This could be a big win for installations which haven't gone through the process of installing native BLAS libs.

On Tue, Dec 15, 2020 at 7:10 AM Sean Owen <sro...@gmail.com> wrote:

> Yes it's intriguing, though as you say not readily available in the wild
> yet.
> I would also expect native BLAS to outperform f2j, so yeah that's the
> interesting question, whether this is a win over native code or not.
> I suppose the upside is eventually, we may expect this API to be available
> in all JVMs, not just those with native libraries added at runtime.
>
> I wonder if a short-term goal would be to ensure that these calls are
> simply abstracted away, which they should already be, so it's easy to plug
> in this new 'BLAS' implementation. I'm sure it's possible to load this
> selectively via reflection, as that's what the current libraries do.
> And there may be additional code paths that could benefit from these
> operations that don't already.
>
> On Tue, Dec 15, 2020 at 8:30 AM Ludovic Henry
> <luhe...@microsoft.com.invalid> wrote:
>
>> Hello,
>>
>> I’ve, over the past few days, looked into using the new Vector API [1] to
>> accelerate some BLAS operations straight from Java. You can find a gist at
>> [2] containing most of the changes in
>> mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala.
>>
>> To measure performance, I’ve added a BLASBenchmark.scala [3] at
>> mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala.
>> I do see some promising speedups, especially compared to F2jBLAS. I’ve
>> unfortunately not been able to install OpenBLAS locally and compare
>> performance to native, but I would still expect native to be faster,
>> especially on large inputs. See [4] for some f2j vs vector performance
>> comparison.
>>
>> The primary blocker is that the Vector API is only available in incubator
>> mode, starting with JDK 16. We can have an easy run-time check whether we
>> can use the Vectorized BLAS. But, to compile the Vectorized BLAS class, we
>> need JDK 16+. Spark 3.0+ does compile with JDK 16 (it works locally), but I
>> don’t know how to selectively compile sources based on the JDK version used
>> at compile-time.
>>
>> But much more importantly, I want to get your feedback before I keep
>> exploring this idea further. Technically, it is feasible, and we’ll observe
>> speed up whenever the native BLAS is not installed. Moreover, I am solely
>> focusing on ML/MLLib for now. However, there is still graphx (I haven’t
>> checked if there is anything vectorizable) and even supporting more
>> explicit use of the Vector API in catalyst, which is a much bigger project.
>>
>> Thank you,
>>
>> Ludovic Henry
>>
>> [1] https://openjdk.java.net/jeps/338
>>
>> [2]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blas-scala
>>
>> [3]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blasbenchmark-scala
>>
>> [4]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-f2j-vs-vector-log