DistributedLDAModel missing APIs in org.apache.spark.ml
I like using the new DataFrame APIs in Spark ML, compared to using RDDs in the older Spark MLlib. But it seems some of the older APIs are missing. In particular, '*.mllib.clustering.DistributedLDAModel' had two APIs that I need now:

  topDocumentsPerTopic
  topTopicsPerDocument

How can I get the same results using the APIs on '*.ml.clustering.DistributedLDAModel'?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DistributedLDAModel-missing-APIs-in-org-apache-spark-ml-tp26535.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
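Until equivalents are exposed on the ml side, one workaround is to derive both results from the per-document topic distributions that ml.clustering.LDA's transform output provides (the "topicDistribution" column). Below is a minimal pure-Scala sketch of that post-processing, assuming the distributions have already been collected into a map keyed by document id; the object name TopicTopK and both method names are mine, not part of any Spark API.

```scala
// Sketch: recovering topDocumentsPerTopic / topTopicsPerDocument by hand.
// Assumption: `dist` maps a document id to its topic-weight vector, i.e. the
// collected "topicDistribution" values from ml.clustering.LDA's transform.
object TopicTopK {
  // For each topic, the k documents with the highest weight on that topic.
  def topDocumentsPerTopic(
      dist: Map[Long, Array[Double]], k: Int): Map[Int, Seq[(Long, Double)]] = {
    val numTopics = dist.head._2.length
    (0 until numTopics).map { t =>
      t -> dist.toSeq
        .map { case (doc, weights) => (doc, weights(t)) }
        .sortBy(-_._2)   // descending by weight on topic t
        .take(k)
    }.toMap
  }

  // For each document, the k topics with the highest weight.
  def topTopicsPerDocument(
      dist: Map[Long, Array[Double]], k: Int): Map[Long, Seq[(Int, Double)]] =
    dist.map { case (doc, weights) =>
      doc -> weights.zipWithIndex
        .map { case (w, t) => (t, w) }
        .sortBy(-_._2)   // descending by weight for this document
        .take(k)
        .toSeq
    }

  def main(args: Array[String]): Unit = {
    val dist = Map(0L -> Array(0.7, 0.2, 0.1), 1L -> Array(0.1, 0.8, 0.1))
    println(topDocumentsPerTopic(dist, 1))
    println(topTopicsPerDocument(dist, 1))
  }
}
```

For large corpora you would keep this on the DataFrame side (e.g. explode the topic weights and take per-topic top-k) rather than collecting to the driver, but the ranking logic is the same.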
Re: Using netlib-java in Spark 1.6 on linux
I have these on my system:

  /usr/lib64:$ /sbin/ldconfig -p | grep liblapack
          liblapack.so.3 (libc6,x86-64) => /usr/lib64/atlas/liblapack.so.3
          liblapack.so.3 (libc6,x86-64) => /usr/lib64/atlas-sse3/liblapack.so.3
          liblapack.so.3 (libc6,x86-64) => /usr/lib64/liblapack.so.3
          liblapack.so.3 (libc6) => /usr/lib/atlas/liblapack.so.3
          liblapack.so.3 (libc6) => /usr/lib/liblapack.so.3
          liblapack.so (libc6,x86-64) => /usr/lib64/liblapack.so
          liblapack.so (libc6) => /usr/lib/liblapack.so

  /usr/lib64:$ /sbin/ldconfig -p | grep libblas
          libblas.so.3 (libc6,x86-64) => /usr/lib64/libblas.so.3
          libblas.so.3 (libc6) => /usr/lib/libblas.so.3
          libblas.so (libc6,x86-64) => /usr/lib64/libblas.so
          libblas.so (libc6) => /usr/lib/libblas.so

And this in my /etc/ld.so.conf:

  include ld.so.conf.d/*.conf
  /usr/lib64

Then ran 'ldconfig'. I also have this:

  /usr/lib64:$ yum list libgfortran
  Loaded plugins: aliases, changelog, kabi, presto, refresh-packagekit, security, tmprepo, ulninfo, verify, versionlock
  Loading support for kernel ABI
  Installed Packages
  libgfortran.i686      4.4.7-16.el6      @base
  libgfortran.x86_64    4.4.7-16.el6      @anacond

I've set:

  LD_LIBRARY_PATH=/usr/lib64

Still no luck. Suggestions? Do I have to

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-netlib-java-in-Spark-1-6-on-linux-tp26386p26392.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
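For reference, the layout netlib-java's NativeSystemBLAS path wants is a libblas.so.3 and liblapack.so.3 resolvable by the loader, ideally pointing at the optimized ATLAS builds rather than the plain reference libraries. The sketch below illustrates that symlink arrangement in a throwaway sandbox directory so it is safe to run as-is; on a real system you would create the links in /usr/lib64 as root, point them at the ATLAS .so files, and re-run ldconfig (the file names here are stand-ins, not exact ATLAS paths).

```shell
# Sandbox sketch of the symlink layout (assumption: netlib-java resolves
# libblas.so.3 / liblapack.so.3 via the loader path). NOT the real /usr/lib64.
LIBDIR="$(mktemp -d)"

# Stand-in for an optimized ATLAS library such as /usr/lib64/atlas/liblapack.so.3
touch "$LIBDIR/atlas-combined.so.3"

# Point the generic sonames netlib-java looks for at the optimized library.
ln -sf "$LIBDIR/atlas-combined.so.3" "$LIBDIR/libblas.so.3"
ln -sf "$LIBDIR/atlas-combined.so.3" "$LIBDIR/liblapack.so.3"

ls -l "$LIBDIR"
```

On the real system the equivalent step would be followed by `ldconfig` so the new links show up in `ldconfig -p` before the JVM starts.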
Using netlib-java in Spark 1.6 on linux
I want to take advantage of the Breeze linear algebra libraries, built on netlib-java, which are used heavily by Spark ML. I've found this amazingly time-consuming to figure out, and have only been able to do so on MacOS. I want to do the same on Linux:

  $ uname -a
  Linux slc10whv 3.8.13-68.3.4.el6uek.x86_64 #2 SMP Tue Jul 14 15:03:36 PDT 2015 x86_64 x86_64 x86_64 GNU/Linux

This is for Spark 1.6. For MacOS, I was able to find the *.jars in the .ivy2 cache and add them to a combination of system and application classpaths. For Linux, I've downloaded the Spark 1.6 source and compiled it with sbt like this:

  sbt/sbt -Pyarn -DskipTests=true -Phadoop-2.3 -Dhadoop.version=2.6.0 -Pnetlib-lgpl clean update assembly package

This gives me 'spark-assembly-1.6.0-hadoop2.6.0.jar', which appears to contain the *.so libs I need, for example:

  netlib-native_ref-linux-x86_64.so

Now I want to compile and package my application so it picks up these netlib-java classes at runtime. Here's the command I'm using:

  spark-submit --properties-file project-defaults.conf --class "main.scala.SparkLDADemo" \
    --jars lib/stanford-corenlp-3.6.0.jar,lib/stanford-corenlp-3.6.0-models.jar,/scratch/cmcmulle/programs/spark/spark-1.6.0/assembly/target/scala-2.10/spark-assembly-1.6.0-hadoop2.6.0.jar \
    target/scala-2.10/sparksql-demo_2.10-1.0.jar

Still, I get the dreaded:

  16/03/02 16:49:21 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
  16/03/02 16:49:21 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS

Can someone please tell me how to build/configure/run a standalone Spark ML application with spark-submit such that it is able to load/use the netlib-java classes? Thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-netlib-java-in-Spark-1-6-on-linux-tp26386.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
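One approach that sometimes avoids the rebuild-the-assembly route entirely is to declare the netlib-java native wrappers as a dependency of the application itself, so they travel in the application jar's classpath. A hedged build.sbt sketch follows; the 1.1.2 version is an assumption based on what netlib-java shipped around the Spark 1.6 era, so verify it against your dependency tree. Note the "all" artifact is a POM-only aggregator, hence the pomOnly() qualifier.

```scala
// build.sbt sketch (assumptions: Spark 1.6.0 on Scala 2.10, netlib-java 1.1.2).
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-mllib" % "1.6.0" % "provided",
  // Pulls in the JNI loader jars plus prebuilt native binaries for all platforms.
  "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
)
```

Equivalently, the same artifacts can be fetched at submit time with spark-submit's --packages flag (e.g. --packages com.github.fommil.netlib:all:1.1.2), which avoids baking them into the application jar.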