luhenry commented on a change in pull request #32253:
URL: https://github.com/apache/spark/pull/32253#discussion_r619297422



##########
File path: mllib-local/pom.xml
##########
@@ -75,48 +75,12 @@
       <type>test-jar</type>
       <scope>test</scope>
     </dependency>
+
+    <dependency>
+      <groupId>dev.ludovic.netlib</groupId>
+      <artifactId>blas</artifactId>
+    </dependency>
   </dependencies>
-  <profiles>
-    <profile>
-      <id>netlib-lgpl</id>

Review comment:
       > The typical policy is that it's OK to release software that can merely 
make use of such libraries at runtime (without actually distributing them 
directly) as long as it doesn't substantially depend on their presence. I 
believe that dynamic linking in the way you describe is OK - just like having 
an SPI in JVM code that may be provided by some other GPL code at the user's 
runtime.
   
   What you describe is exactly how `dev.ludovic.netlib` works. It doesn't 
substantially depend on OpenBLAS, MKL, or any other native BLAS library being 
present, as it falls back to a pure Java implementation otherwise. The 
fallback is transparent and functionally equivalent; only performance is 
affected.
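
To illustrate the fallback described above, here is a minimal, self-contained sketch. It is not the actual `dev.ludovic.netlib` code: `Blas`, `PureJavaBlas`, and `select` are hypothetical names, and the native loader is simulated with a `Supplier` that fails the way a missing `libblas.so` would.

```java
import java.util.function.Supplier;

public class BlasFallbackSketch {
    /** Hypothetical, minimal BLAS surface; a real library exposes many more routines. */
    public interface Blas {
        String name();
        double ddot(int n, double[] x, double[] y);
    }

    /** Pure Java fallback: always available, only slower than a tuned native library. */
    public static final class PureJavaBlas implements Blas {
        @Override
        public String name() {
            return "java";
        }

        @Override
        public double ddot(int n, double[] x, double[] y) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        }
    }

    /** Try the native loader first; fall back when the native library cannot be linked. */
    public static Blas select(Supplier<Blas> nativeLoader) {
        try {
            return nativeLoader.get();
        } catch (UnsatisfiedLinkError | RuntimeException e) {
            // Same results, only the performance differs.
            return new PureJavaBlas();
        }
    }

    public static void main(String[] args) {
        // Simulate a machine with no native BLAS on the library path.
        Blas blas = select(() -> {
            throw new UnsatisfiedLinkError("libblas.so not found");
        });
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {4.0, 5.0, 6.0};
        System.out.println(blas.name() + " ddot = " + blas.ddot(3, x, y));
    }
}
```

The caller only ever sees the `Blas` interface, so whether the native or the Java implementation was selected is invisible to Spark code, which is why no separate build profile would be needed.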
   
   > My main goal is to preserve current behavior.
   
   I fully agree with that: we don't want to break current behavior or 
bring in additional, unwanted dependencies.
   
   > Right now if someone has, say, MKL on their native lib path for the JVM, 
and built with this alternate profile, it'd be accelerated. If you're saying 
that still works, but would not require this separate build profile because of 
the different loading strategy, that's an improvement.
   
   That's exactly how it works with `dev.ludovic.netlib` on JDK16+ today with 
the implementation based on the Foreign Linker API, and that's how I want 
it to work with the JNI-based implementation for JDK8 and JDK11 in the future.
   
   > Have you by chance tried this integration when OpenBLAS is present to 
verify it makes use of it?
   
   Yes, and it's a lot faster than F2J. The `native` results in 
https://github.com/apache/spark/pull/32253#issue-619173915 were obtained 
with the implementation based on the Foreign Linker API. For `dgemm`, `f2j` 
is **18-30x slower** than `native` (i.e. OpenBLAS). I also needed 
to set `LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu` (on Ubuntu 20.04) so that 
`libblas.so` is on the `ld` search path.
   
   To make use of MKL, I only need to set 
`LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/latest/lib/intel64:/opt/intel/oneapi/compiler/latest/linux/compiler/lib/intel64_lin`
 and pass `-Ddev.ludovic.netlib.blas.nativeLib=mkl_rt`.
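
As a rough sketch of what those two settings amount to: the property name below is the one quoted above, but the default of `blas` and the helper names (`requestedLibrary`, `platformFileName`, `findOnPath`) are assumptions for illustration, not the library's API. The lookup only mimics the `LD_LIBRARY_PATH` part of the dynamic linker's search, not its cache or default directories.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class NativeLibNameSketch {
    // Property name as quoted in this comment; the "blas" default is an assumption.
    static final String PROPERTY = "dev.ludovic.netlib.blas.nativeLib";

    /** Library name requested via -Ddev.ludovic.netlib.blas.nativeLib=..., else "blas". */
    public static String requestedLibrary() {
        return System.getProperty(PROPERTY, "blas");
    }

    /** Platform file name for a library, e.g. "mkl_rt" becomes "libmkl_rt.so" on Linux. */
    public static String platformFileName(String libName) {
        return System.mapLibraryName(libName);
    }

    /** First directory entry of a path list (LD_LIBRARY_PATH style) that contains the file. */
    public static Optional<Path> findOnPath(String pathList, String fileName) {
        if (pathList == null) {
            return Optional.empty();
        }
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String fileName = platformFileName(requestedLibrary());
        Optional<Path> hit = findOnPath(System.getenv("LD_LIBRARY_PATH"), fileName);
        System.out.println(fileName + " -> " + hit.map(Path::toString).orElse("not on LD_LIBRARY_PATH"));
    }
}
```

Running it with `-Ddev.ludovic.netlib.blas.nativeLib=mkl_rt` and the MKL directories on `LD_LIBRARY_PATH` is a quick way to check that the file the JVM would look for is actually reachable.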
   
   I haven't tried with other BLAS implementations.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



