Hi,

I am getting the following warnings when I run my PySpark job:

My code is:

from pyspark.mllib.linalg.distributed import RowMatrix

mat = RowMatrix(tf_rdd_vec.cache())  # RDD is cached
svd = mat.computeSVD(num_topics, computeU=False)

I am using an Ubuntu 16.04 EC2 instance, and I have installed the following 
libraries on the system:

sudo apt install libarpack2 Arpack++ libatlas-base-dev liblapacke-dev \
    libblas-dev gfortran liblapack-dev libnetlib-java libgfortran3 \
    libatlas3-base libopenblas-base

Now when I list the /usr/lib directory, it shows the .so files:

ubuntu:~$ ls /usr/lib/*.so | grep "pack\|blas"
/usr/lib/libarpack.so
/usr/lib/libblas.so
/usr/lib/libcblas.so
/usr/lib/libf77blas.so
/usr/lib/liblapack_atlas.so
/usr/lib/liblapacke.so
/usr/lib/liblapack.so
/usr/lib/libopenblasp-r0.2.18.so
/usr/lib/libopenblas.so
/usr/lib/libparpack.so

I have also adjusted LD_LIBRARY_PATH to point to the above directory:

export LD_LIBRARY_PATH=/var/lib/
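
Note that the listing above is from /usr/lib while the export names /var/lib/, so it may be worth confirming which directories on LD_LIBRARY_PATH actually hold the libraries. A minimal sketch of such a check follows. The helper name find_native_libs and the list of name stems are illustrative assumptions based on the listing above, not part of Spark or netlib-java, and finding the files does not by itself guarantee that Spark will load them.

```python
import glob
import os

# Name stems taken from the /usr/lib listing above (an assumption; adjust as needed).
LIB_STEMS = ("libarpack", "libblas", "liblapack", "libopenblas")


def find_native_libs(search_path, stems=LIB_STEMS):
    """Map each directory in a search path to the matching shared libraries in it.

    search_path uses the platform path separator (":" on Linux), the same
    format as LD_LIBRARY_PATH. Empty entries are skipped.
    """
    found = {}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        matches = []
        for stem in stems:
            # Match both plain and versioned names, e.g. libblas.so and libblas.so.3
            matches.extend(glob.glob(os.path.join(directory, stem + "*.so*")))
        found[directory] = sorted(set(matches))
    return found


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, libs in find_native_libs(path).items():
        print(directory, "->", libs if libs else "no matching libraries")
```

Running the script in the same shell that launches the job prints, for each directory on LD_LIBRARY_PATH, the ARPACK/BLAS/LAPACK shared objects found there, or a note that none were found.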

However, I am still not able to use the native ARPACK implementation. Also, I am 
caching the RDD that I pass to the matrix, but it still throws the cache warning. 
Any suggestions on how to resolve these 3 warnings?

I downloaded the precompiled version of spark-2.2.0 from the Spark download page.

StackOverflow Link: 
https://stackoverflow.com/questions/46612006/how-to-properly-setup-native-arpack-for-spark-2-2-0

Best Regards,
