Vincent created SPARK-21688:
-------------------------------
Summary: performance improvement in mllib SVM with native BLAS
Key: SPARK-21688
URL: https://issues.apache.org/jira/browse/SPARK-21688
Project: Spark
Issue Type: Improvement
Components: MLlib
Affects Versions: 2.2.0
Environment: 4 nodes: 1 master node, 3 worker nodes
CPU model: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
Memory: 180G
Cores per node: 10
Reporter: Vincent
In the current MLlib SVM implementation, we found that the CPU is not fully
utilized. One reason is that the HingeGradient computation is hard-wired to
use f2j BLAS. We found earlier
(https://issues.apache.org/jira/browse/SPARK-21305) that, with proper settings,
native BLAS is generally faster than f2j at the unit-test level. Here we
switched the BLAS operations in SVM to MKL BLAS and produced an end-to-end
performance report showing that in most cases native BLAS outperforms f2j
BLAS, by up to 50%. We therefore suggest removing the calls that are fixed to
f2j and using native BLAS when it is available. If this proposal is
acceptable, we will move on to benchmarking the other affected algorithms.
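
For readers unfamiliar with where BLAS enters the picture, the following is a
minimal sketch of a hinge-loss gradient step, not the Spark code itself. It
assumes labels in {0, 1} rescaled to {-1, +1}, and the names dot, axpy, and
hinge_gradient are illustrative only. The two helper functions stand in for
the level-1 BLAS routines (ddot and daxpy) whose backend, f2j or native, is
what this issue is about.

```python
def dot(x, y):
    """Stand-in for a level-1 BLAS dot product (ddot)."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Stand-in for level-1 BLAS axpy: y := alpha * x + y, updated in place."""
    for i, xi in enumerate(x):
        y[i] += alpha * xi


def hinge_gradient(features, label, weights, cum_gradient):
    """Add one example's hinge-loss subgradient to cum_gradient; return its loss.

    Assumption: label is 0 or 1 and is rescaled to -1 or +1.
    """
    label_scaled = 2.0 * label - 1.0
    margin = label_scaled * dot(features, weights)
    if margin < 1.0:
        # Example is inside the margin: subgradient is -label_scaled * features.
        axpy(-label_scaled, features, cum_gradient)
        return 1.0 - margin
    # Example is outside the margin: zero loss, zero subgradient.
    return 0.0
```

Each training example costs one dot and at most one axpy, so the speed of
these two routines bounds the per-example cost of the gradient computation.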