luhenry commented on a change in pull request #30810:
URL: https://github.com/apache/spark/pull/30810#discussion_r547392950
##########
File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala
##########
@@ -18,28 +18,51 @@
package org.apache.spark.ml.linalg
import com.github.fommil.netlib.{BLAS => NetlibBLAS, F2jBLAS}
-import com.github.fommil.netlib.BLAS.{getInstance => NativeBLAS}
/**
* BLAS routines for MLlib's vectors and matrices.
*/
private[spark] object BLAS extends Serializable {
- @transient private var _f2jBLAS: NetlibBLAS = _
+ @transient private var _javaBLAS: NetlibBLAS = _
@transient private var _nativeBLAS: NetlibBLAS = _
private val nativeL1Threshold: Int = 256
- // For level-1 function dspmv, use f2jBLAS for better performance.
- private[ml] def f2jBLAS: NetlibBLAS = {
- if (_f2jBLAS == null) {
- _f2jBLAS = new F2jBLAS
+ // For level-1 function dspmv, use javaBLAS for better performance.
+ private[ml] def javaBLAS: NetlibBLAS = {
+ if (_javaBLAS == null) {
+ _javaBLAS =
+ try {
+ // scalastyle:off classforname
+ Class.forName("org.apache.spark.ml.linalg.VectorizedBLAS", true,
+ Option(Thread.currentThread().getContextClassLoader)
+ .getOrElse(getClass.getClassLoader))
+ .newInstance()
+ .asInstanceOf[NetlibBLAS]
+ // scalastyle:on classforname
+ } catch {
+ case _: Throwable => new F2jBLAS
+ }
+ }
+ _javaBLAS
+ }
+
+ // For level-3 routines, we use the native BLAS.
+ private[ml] def nativeBLAS: NetlibBLAS = {
+ if (_nativeBLAS == null) {
+ _nativeBLAS =
+ if (NetlibBLAS.getInstance.isInstanceOf[F2jBLAS]) {
+ javaBLAS
+ } else {
+ NetlibBLAS.getInstance
+ }
}
- _f2jBLAS
+ _nativeBLAS
}
private[ml] def getBLAS(vectorSize: Int): NetlibBLAS = {
if (vectorSize < nativeL1Threshold) {
- f2jBLAS
+ javaBLAS
Review comment:
I would argue for the vector implementation to "replace" the f2j one
when possible, rather than the native one, for three reasons. First, the vector
implementation is faster than f2j in all cases. Second, the vector
implementation doesn't suffer from the Java-to-native transition overhead that
the native implementation has (the Vector API operations are all compiler
intrinsics and are completely transparent to the user of the Vector API). And
third, like f2j, the vector implementation doesn't depend on any external
dependency other than a recent enough JDK, which is already used to run the whole of Spark.
The current fallback chain is as follows:
- for `javaBLAS`: 1. vector, 2. f2j
- for `nativeBLAS`: 1. native, 2. vector, 3. f2j
That guarantees access to the fastest implementation available given
the set of options enabled (compiled with `-Pvectorized`, compiled with JDK16+,
run with JDK16+, OpenBLAS or MKL installed).
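For illustration, here is a minimal Scala sketch of that fallback chain. It is not the PR's final code; the `VectorizedBLAS` class name is taken from the diff above, while the object name and the use of lazy vals are assumptions made for the sake of the example.

```scala
import com.github.fommil.netlib.{BLAS => NetlibBLAS, F2jBLAS}

object FallbackChainSketch {
  // javaBLAS: prefer the Vector-API-backed implementation when it can be
  // loaded (built with -Pvectorized and running on a recent enough JDK),
  // otherwise fall back to the pure-Java f2j implementation.
  lazy val javaBLAS: NetlibBLAS =
    try {
      Class.forName("org.apache.spark.ml.linalg.VectorizedBLAS")
        .getDeclaredConstructor()
        .newInstance()
        .asInstanceOf[NetlibBLAS]
    } catch {
      case _: Throwable => new F2jBLAS
    }

  // nativeBLAS: prefer a real native library (OpenBLAS/MKL). If netlib-java
  // could only resolve its pure-Java F2jBLAS instance, reuse javaBLAS so the
  // vector implementation still wins over f2j.
  lazy val nativeBLAS: NetlibBLAS =
    if (NetlibBLAS.getInstance.isInstanceOf[F2jBLAS]) javaBLAS
    else NetlibBLAS.getInstance
}
```

With this ordering, a caller such as `getBLAS(vectorSize)` ends up on the fastest implementation the runtime actually supports and degrades gracefully to f2j when neither a native library nor the Vector API is available.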
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]