It's fine to prototype it. Because users can already get BLAS support by
enabling a profile, I think it's worth establishing that perf is at least
comparable before adding it as another option.
Or it could simply live in a separate module / library until then, if it's
desirable to release it.
This may be a nice testing ground to see how much the API can substitute in
for BLAS operations.

On Wed, Dec 16, 2020 at 4:41 AM Ludovic Henry <luhe...@microsoft.com> wrote:

> Hi,
>
>
>
> Thank you for the feedback. I’ll work on the profile-based approach to
> selectively compile this VectorBLAS class in. As for the run-time check, I
> haven’t specifically used a reflection-based approach but a simpler
> `try { new VectorBLAS() } catch (NoClassDefFoundError) { new F2jBLAS() }`.
> I’ll submit a PR against github.com/apache/spark with this change. Should I
> also file an issue in Jira?
>
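A minimal sketch of the `try`/`catch` fallback quoted above, in plain Java. The `Blas` interface, `ScalarBlas`, and `select` are hypothetical names made up for illustration; they are not Spark's or netlib's actual classes, and the "vectorized" supplier here only simulates a JVM where the Vector API classes cannot be linked.

```java
import java.util.function.Supplier;

final class BlasFallback {
    /** Hypothetical, minimal BLAS surface; only ddot is modeled. */
    interface Blas {
        String name();
        double ddot(int n, double[] x, double[] y);
    }

    /** Plain-Java implementation standing in for F2jBLAS. */
    static final class ScalarBlas implements Blas {
        @Override public String name() { return "scalar"; }
        @Override public double ddot(int n, double[] x, double[] y) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        }
    }

    /** Returns the preferred implementation, or the fallback if its classes are missing. */
    static Blas select(Supplier<Blas> preferred, Supplier<Blas> fallback) {
        try {
            return preferred.get();
        } catch (NoClassDefFoundError e) {
            // Thrown when a class the preferred implementation needs cannot be linked.
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // Assumption: simulate a JVM without the incubating Vector API module.
        Supplier<Blas> vectorized = () -> {
            throw new NoClassDefFoundError("jdk/incubator/vector/DoubleVector");
        };
        Blas blas = select(vectorized, ScalarBlas::new);
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {4.0, 5.0, 6.0};
        System.out.println(blas.name() + " ddot = " + blas.ddot(3, x, y));
    }
}
```

Taking suppliers rather than instances keeps construction of the preferred implementation inside the `try`, which is where the linkage failure would surface.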
>
>
> On a side note, I worked yesterday on extracting this code into a
> standalone project [1]. The goal is not so much for Spark to depend on it
> (though that would be possible) as to make it easier to develop, test, and
> benchmark new implementations on my end.
>
>
>
> Thank you,
>
> Ludovic
>
>
>
> [1] https://github.com/luhenry/blas
>
>
>
> *From: *Erik Krogen <xkro...@apache.org>
> *Sent: *Tuesday, 15 December 2020 17:33
> *To: *Sean Owen <sro...@gmail.com>
> *Cc: *Ludovic Henry <luhe...@microsoft.com>; dev@spark.apache.org; Bernhard
> Urban-Forster <beu...@microsoft.com>
> *Subject: *Re: Usage of JDK Vector API in ML/MLLib
>
>
>
> Regarding selective compilation, you can hide sources behind a Maven
> profile such as `-Pvectorized`. Check out what we do to switch between the
> `hive-1.2` and `hive-2.3` profiles where different source directories are
> grabbed at compile-time (the hive-1.2 profile was recently removed so you
> might have to go back a little in git history). This won't do it
> automatically based on JDK version, but it's probably good enough. At
> runtime you can more easily do a JDK version check -- I agree with Sean on
> loading via reflection.
>
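As a rough sketch of the run-time side described above: a JDK version check combined with a reflective class lookup. The class and method names below are hypothetical; the only assumptions taken from the thread are that the Vector API first appears in JDK 16 and lives in the incubator module. `Runtime.version().feature()` requires JDK 10 or later.

```java
final class VectorApiProbe {
    /** First JDK feature release with the incubating Vector API (JEP 338). */
    static final int MIN_FEATURE_VERSION = 16;

    /** Pure version comparison, kept separate so it can be checked in isolation. */
    static boolean versionAllows(int featureVersion) {
        return featureVersion >= MIN_FEATURE_VERSION;
    }

    /** True if the named class can be located; the class is not initialized. */
    static boolean classPresent(String className) {
        try {
            Class.forName(className, false, VectorApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Both conditions must hold: new enough JDK and the incubator module resolved. */
    static boolean vectorApiUsable() {
        return versionAllows(Runtime.version().feature())
            && classPresent("jdk.incubator.vector.DoubleVector");
    }

    public static void main(String[] args) {
        System.out.println("feature version: " + Runtime.version().feature());
        System.out.println("vector API usable: " + vectorApiUsable());
    }
}
```

The version check alone is not sufficient, because incubator modules are only resolved when the JVM is started with `--add-modules jdk.incubator.vector`; the class lookup covers that case.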
>
>
> Personally, I see no reason not to start adding this support in
> preparation for broader adoption of JDK 16, provided that it is properly
> protected behind flags. This could be a big win for installations which
> haven't gone through the process of installing native BLAS libs.
>
>
>
> On Tue, Dec 15, 2020 at 7:10 AM Sean Owen <sro...@gmail.com> wrote:
>
> Yes it's intriguing, though as you say not readily available in the wild
> yet.
>
> I would also expect native BLAS to outperform f2j, so yeah that's the
> interesting question, whether this is a win over native code or not.
>
> I suppose the upside is eventually, we may expect this API to be available
> in all JVMs, not just those with native libraries added at runtime.
>
>
>
> I wonder if a short-term goal would be to ensure that these calls are
> simply abstracted away, which they should already be, so it's easy to plug
> in this new 'BLAS' implementation. I'm sure it's possible to load this
> selectively via reflection, as that's what the current libraries do.
>
> And there may be additional code paths that could benefit from these
> operations but don't use them yet.
>
>
>
> On Tue, Dec 15, 2020 at 8:30 AM Ludovic Henry
> <luhe...@microsoft.com.invalid> wrote:
>
> Hello,
>
>
>
> Over the past few days, I’ve looked into using the new Vector API [1] to
> accelerate some BLAS operations straight from Java. You can find a gist at
> [2] containing most of the changes in
> mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala.
>
>
>
> To measure performance, I’ve added a BLASBenchmark.scala [3] at
> mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala.
> I do see some promising speedups, especially compared to F2jBLAS. I’ve
> unfortunately not been able to install OpenBLAS locally and compare
> performance to native, but I would still expect native to be faster,
> especially on large inputs. See [4] for an f2j vs. vector performance
> comparison.
>
>
>
> The primary blocker is that the Vector API is only available in incubator
> mode, starting with JDK 16. We can have an easy run-time check whether we
> can use the Vectorized BLAS. But, to compile the Vectorized BLAS class, we
> need JDK 16+. Spark 3.0+ does compile with JDK 16 (it works locally), but I
> don’t know how to selectively compile sources based on the JDK version used
> at compile-time.
>
>
>
> But much more importantly, I want to get your feedback before I keep
> exploring this idea further. Technically, it is feasible, and we’ll observe
> a speedup whenever the native BLAS is not installed. Moreover, I am solely
> focusing on ML/MLLib for now. However, there is still graphx (I haven’t
> checked if there is anything vectorizable) and even supporting more
> explicit use of the Vector API in catalyst, which is a much bigger project.
>
>
>
> Thank you,
>
> Ludovic Henry
>
>
>
> [1] https://openjdk.java.net/jeps/338
>
> [2]
> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blas-scala
>
> [3]
> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-blasbenchmark-scala
>
> [4]
> https://gist.github.com/luhenry/6b24ac146a110143ad31736caf7250e6#file-f2j-vs-vector-log
>
>
>
