Regarding selective compilation, you can hide sources behind a Maven
profile such as `-Pvectorized`. Check out what we do to switch between the
`hive-1.2` and `hive-2.3` profiles where different source directories are
grabbed at compile-time (the hive-1.2 profile was recently removed so you
might have to go back a little in git history). This won't do it
automatically based on JDK version, but it's probably good enough. At
runtime you can more easily do a JDK version check -- I agree with Sean on
loading via reflection.
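
To make the runtime side concrete, here is a minimal sketch of a version check followed by a reflective load. It is only an illustration: `BlasSelector`, the `Blas` interface, `JavaBlas`, and the implementation class name passed to `load` are all hypothetical names, not anything that exists in Spark. It assumes the vectorized class is compiled separately (for example under the profile) and has a public no-arg constructor.

```java
// Hypothetical sketch: none of these names exist in Spark.
final class BlasSelector {

    /** Stand-in for whatever BLAS abstraction the callers already use. */
    public interface Blas {
        double dot(double[] x, double[] y);
    }

    /** Plain-Java fallback used when the vectorized class cannot be loaded. */
    public static final class JavaBlas implements Blas {
        @Override
        public double dot(double[] x, double[] y) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        }
    }

    /** Maps "1.8" to 8 and "11" or "16" to 11 or 16. */
    static int parseFeatureVersion(String spec) {
        String s = spec.startsWith("1.") ? spec.substring(2) : spec;
        int dot = s.indexOf('.');
        if (dot >= 0) {
            s = s.substring(0, dot);
        }
        return Integer.parseInt(s);
    }

    /** True if the named class is visible, without initializing it. */
    static boolean isClassPresent(String name) {
        try {
            Class.forName(name, false, BlasSelector.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Tries to instantiate implClassName reflectively when running on
     * JDK 16+ with the incubator module resolved; otherwise falls back.
     */
    static Blas load(String implClassName) {
        int feature = parseFeatureVersion(
            System.getProperty("java.specification.version"));
        if (feature >= 16 && isClassPresent("jdk.incubator.vector.DoubleVector")) {
            try {
                return (Blas) Class.forName(implClassName)
                    .getDeclaredConstructor()
                    .newInstance();
            } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
                // Vectorized class missing or unusable: use the fallback below.
            }
        }
        return new JavaBlas();
    }
}
```

Because the vectorized class is only ever referenced by name, nothing here needs JDK 16 to compile. Note that the incubator module is not resolved by default, so the JVM would also have to be started with `--add-modules jdk.incubator.vector` for the presence check to succeed.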

Personally, I see no reason not to start adding this support in preparation
for broader adoption of JDK 16, provided that it is properly protected
behind flags. This could be a big win for installations which haven't gone
through the process of installing native BLAS libs.

On Tue, Dec 15, 2020 at 7:10 AM Sean Owen <sro...@gmail.com> wrote:

> Yes it's intriguing, though as you say not readily available in the wild
> yet.
> I would also expect native BLAS to outperform f2j, so yeah that's the
> interesting question, whether this is a win over native code or not.
> I suppose the upside is eventually, we may expect this API to be available
> in all JVMs, not just those with native libraries added at runtime.
>
> I wonder if a short-term goal would be to ensure that these calls are
> simply abstracted away, which they should already be, so it's easy to plug
> in this new 'BLAS' implementation. I'm sure it's possible to load this
> selectively via reflection, as that's what the current libraries do.
> And there may be additional code paths that could benefit from these
> operations that don't already.
>
> On Tue, Dec 15, 2020 at 8:30 AM Ludovic Henry
> <luhe...@microsoft.com.invalid> wrote:
>
>> Hello,
>>
>>
>>
>> Over the past few days, I’ve looked into using the new Vector API [1] to
>> accelerate some BLAS operations straight from Java. You can find a gist at
>> [2] containing most of the changes in
>> mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala.
>>
>>
>>
>> To measure performance, I’ve added a BLASBenchmark.scala [3] at
>> mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala.
>> I do see some promising speedups, especially compared to F2jBLAS. I’ve
>> unfortunately not been able to install OpenBLAS locally and compare
>> performance to native, but I would still expect native to be faster,
>> especially on large inputs. See [4] for some f2j vs vector performance
>> comparison.
>>
>>
>>
>> The primary blocker is that the Vector API is only available in incubator
>> mode, starting with JDK 16. We can have an easy run-time check whether we
>> can use the Vectorized BLAS. But, to compile the Vectorized BLAS class, we
>> need JDK 16+. Spark 3.0+ does compile with JDK 16 (it works locally), but I
>> don’t know how to selectively compile sources based on the JDK version used
>> at compile-time.
>>
>>
>>
>> But much more importantly, I want to get your feedback before I keep
>> exploring this idea further. Technically, it is feasible, and we’ll see a
>> speedup whenever the native BLAS is not installed. Moreover, I am solely
>> focusing on ML/MLLib for now. However, there is still graphx (I haven’t
>> checked if there is anything vectorizable) and even supporting more
>> explicit use of the Vector API in catalyst, which is a much bigger project.
>>
>>
>>
>> Thank you,
>>
>> Ludovic Henry
>>
>>
>>
>> [1] https://openjdk.java.net/jeps/338
>>
>> [2]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blas-scala
>>
>> [3]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blasbenchmark-scala
>>
>> [4]
>> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-f2j-vs-vector-log
>>
>
