Vincent commented on SPARK-21688:

[~srowen] Thanks for your comments. I think users who decide to use native BLAS 
should be aware of how the threading configuration affects performance, so 
checking this environment variable in MLlib doesn't make sense. And no, we did 
not present only the best-case result: we averaged three test runs for each 
case. The results show that for small datasets native BLAS might have no 
advantage over f2j, but the gap is small, and we would expect processing large 
datasets to be the more common case here.

> performance improvement in mllib SVM with native BLAS 
> ------------------------------------------------------
>                 Key: SPARK-21688
>                 URL: https://issues.apache.org/jira/browse/SPARK-21688
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.2.0
>         Environment: 4 nodes: 1 master node, 3 worker nodes
> model name      : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
> Memory : 180G
> num of core per node: 10
>            Reporter: Vincent
>         Attachments: ddot unitest.png, mllib svm training.png, 
> native-trywait.png, svm1.png, svm2.png, svm-mkl-1.png, svm-mkl-2.png
> In the current MLlib SVM implementation, we found that the CPU is not fully 
> utilized. One reason is that f2j BLAS is set to be used in the HingeGradient 
> computation. As we found earlier 
> (https://issues.apache.org/jira/browse/SPARK-21305), with proper 
> settings native BLAS is generally better than f2j at the unit-test level. 
> Here we make the BLAS operations in SVM go through MKL BLAS and obtain an 
> end-to-end performance report showing that in most cases native BLAS 
> outperforms f2j BLAS by up to 50%.
> So we suggest removing those calls that are fixed to f2j and using native 
> BLAS when available. If this proposal is acceptable, we will move on to 
> benchmark the other affected algorithms. 
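
For readers unfamiliar with where BLAS enters the hinge gradient, here is a minimal sketch in plain Python. It is not Spark's code: the function names `dot`, `scal`, and `hinge_gradient` are hypothetical, and the 0/1 label convention is an assumption. The two helpers stand in for the level-1 BLAS routines (ddot, dscal) that the issue proposes to route to a native library instead of f2j.

```python
from typing import List, Tuple


def dot(x: List[float], y: List[float]) -> float:
    """Stand-in for BLAS ddot: sum of elementwise products."""
    return sum(a * b for a, b in zip(x, y))


def scal(alpha: float, x: List[float]) -> List[float]:
    """Stand-in for BLAS dscal: returns alpha * x as a new list."""
    return [alpha * a for a in x]


def hinge_gradient(
    features: List[float], label: float, weights: List[float]
) -> Tuple[List[float], float]:
    """Gradient and loss of the hinge loss for a single example.

    Assumption: labels are 0 or 1 and are rescaled to -1 or +1.
    """
    scaled_label = 2.0 * label - 1.0
    score = dot(features, weights)  # one ddot per example
    if 1.0 > scaled_label * score:
        # Margin violated: gradient is -y * x, loss is 1 - y * (w . x).
        return scal(-scaled_label, features), 1.0 - scaled_label * score
    # Margin satisfied: zero gradient and zero loss.
    return [0.0] * len(features), 0.0
```

Each training example costs one dot product plus, when the margin is violated, one vector scaling, so the speed of these two routines largely determines the per-iteration cost that the benchmark above measures.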
