Dennis Aumiller created SPARK-24674:
---------------------------------------
Summary: Spark on Kubernetes BLAS performance
Key: SPARK-24674
URL: https://issues.apache.org/jira/browse/SPARK-24674
Project: Spark
Issue Type: Question
Components: Build, Kubernetes, MLlib
Affects Versions: 2.3.1
Environment: Spark 2.3.1 SNAPSHOT (as of June 25th)
Kubernetes version 1.7.5
Kubernetes cluster, consisting of 4 Nodes with 16 GB RAM, 8 core Intel
processors.
Reporter: Dennis Aumiller
Native BLAS libraries usually speed up CPU-heavy operations, such as those in MLlib, quite significantly.
Of course, the initial warning
{code:java}
WARN BLAS:61 - Failed to load implementation from:
com.github.fommil.netlib.NativeSystemBLAS
{code}
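For what it's worth, whether these classes are loadable can be probed from a plain JVM without any Spark dependency. A minimal sketch (the helper class and method names are my own, not part of Spark or netlib-java; the probed class names are the ones that appear in the warning above):

```java
// Hedged sketch: probe whether the netlib-java BLAS classes can be loaded
// on this JVM, mirroring the check that produces the WARN message above.
public class BlasProbe {
    // Returns "loadable" if the class (and its native library, if any)
    // initializes, otherwise a short description of the failure.
    static String probe(String className) {
        try {
            Class.forName(className);
            return "loadable";
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers UnsatisfiedLinkError from missing natives.
            return "unavailable: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Class names used by netlib-java / Spark MLlib's BLAS loading chain.
        System.out.println("NativeSystemBLAS: " + probe("com.github.fommil.netlib.NativeSystemBLAS"));
        System.out.println("NativeRefBLAS:    " + probe("com.github.fommil.netlib.NativeRefBLAS"));
        System.out.println("F2jBLAS:          " + probe("com.github.fommil.netlib.F2jBLAS"));
    }
}
```

Running this inside the Spark container quickly shows whether the failure is a missing jar (ClassNotFoundException) or a missing native library (UnsatisfiedLinkError).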
cannot be resolved so easily: as reported
[here|https://github.com/apache/spark/pull/19717/files/7d2b30373b2e4d8d5311e10c3f9a62a2d900d568],
the problem appears to stem from the base image used by the Spark
Dockerfile.
Rebuilding Spark with
{code:java}
-Pnetlib-lgpl
{code}
does not solve the problem either, but I did manage to build BLAS and LAPACK
into the Alpine image, albeit with a number of workarounds.
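For reference, the rough shape of what I mean is sketched below. This is a hedged outline, not a verified recipe: the Alpine package names are assumptions, and note that Alpine ships musl rather than glibc, which is a common reason netlib-java's pre-built natives fail to load on the stock Spark image.

```shell
# Hedged sketch (package names are assumptions; Alpine uses musl, not glibc):
# install native BLAS/LAPACK toolchain inside the Alpine-based image.
apk add --no-cache openblas openblas-dev lapack lapack-dev gfortran

# Rebuild Spark with the LGPL netlib natives bundled:
./build/mvn -Pnetlib-lgpl -Pkubernetes -DskipTests clean package
```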
Interestingly, I noticed that the performance of PCA in my case dropped quite
significantly with native BLAS support, compared to the netlib-java fallback. I
am aware of [#SPARK-21305] as well, but it did not help my case either.
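In case it helps others reproduce: the usual mitigation discussed around native BLAS slowdowns is to pin the BLAS libraries to a single thread per task, so that OpenBLAS/MKL threads do not oversubscribe cores already saturated by Spark's own parallelism. A hedged sketch (the environment variable names are the ones OpenBLAS and MKL honor; `spark.executorEnv.*` is the standard way to pass them to executors; the jar path is a placeholder):

```shell
# Hedged sketch: limit native BLAS to one thread per Spark task to avoid
# oversubscription between BLAS threads and Spark's task parallelism.
spark-submit \
  --conf spark.executorEnv.OPENBLAS_NUM_THREADS=1 \
  --conf spark.executorEnv.MKL_NUM_THREADS=1 \
  local:///path/to/app.jar   # placeholder application jar
```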
Furthermore, calling SVD on a matrix of size only 5000x5000 (density 1%)
already throws an error when trying to use native ARPACK, but runs perfectly
fine with the fallback version.
The question is whether there has already been any investigation in this
direction.
If not, would it be of interest to the Spark community to provide:
* a more detailed report covering timings/configurations/test setup
* a Dockerfile that builds Spark with BLAS/LAPACK/ARPACK, using the
shipped Dockerfile as a basis
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]