DistributedLDAModel missing APIs in org.apache.spark.ml

2016-03-19 Thread cindymc
I like using the new DataFrame APIs in Spark ML compared to the RDD-based APIs
in the older Spark MLlib.  But it seems some of the older APIs are missing.  In
particular, '*.mllib.clustering.DistributedLDAModel' has two methods that I
need now:

topDocumentsPerTopic
topTopicsPerDocument

How can I get at the same results using the APIs on
'*.ml.clustering.DistributedLDAModel'?
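
The closest I've found in the ml API is to go through transform(), which
appends a "topicDistribution" vector column, and then rank by the weights
myself.  A rough sketch of what I mean, assuming Spark 1.6, a long-typed 'id'
column on the input DataFrame, and the default 'topicDistribution' output
column:

import org.apache.spark.ml.clustering.LDAModel
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.sql.DataFrame

// topTopicsPerDocument: keep the k largest topic weights per document.
def topTopicsPerDocument(model: LDAModel, docs: DataFrame, k: Int) =
  model.transform(docs).select("id", "topicDistribution").map { row =>
    val top = row.getAs[Vector](1).toArray.zipWithIndex.sortBy(-_._1).take(k)
    (row.getLong(0), top.map(_._2), top.map(_._1))  // (docId, topics, weights)
  }

// topDocumentsPerTopic: rank all documents by their weight for one topic.
def topDocumentsPerTopic(model: LDAModel, docs: DataFrame, topic: Int, k: Int) =
  model.transform(docs).select("id", "topicDistribution")
    .map(row => (row.getLong(0), row.getAs[Vector](1)(topic)))
    .sortBy(-_._2)  // RDD.sortBy, descending weight
    .take(k)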






Re: Using netlib-java in Spark 1.6 on linux

2016-03-03 Thread cindymc
I have these on my system:

/usr/lib64:$ /sbin/ldconfig -p | grep liblapack
liblapack.so.3 (libc6,x86-64) => /usr/lib64/atlas/liblapack.so.3
liblapack.so.3 (libc6,x86-64) => /usr/lib64/atlas-sse3/liblapack.so.3
liblapack.so.3 (libc6,x86-64) => /usr/lib64/liblapack.so.3
liblapack.so.3 (libc6) => /usr/lib/atlas/liblapack.so.3
liblapack.so.3 (libc6) => /usr/lib/liblapack.so.3
liblapack.so (libc6,x86-64) => /usr/lib64/liblapack.so
liblapack.so (libc6) => /usr/lib/liblapack.so
/usr/lib64:$ /sbin/ldconfig -p | grep libblas
libblas.so.3 (libc6,x86-64) => /usr/lib64/libblas.so.3
libblas.so.3 (libc6) => /usr/lib/libblas.so.3
libblas.so (libc6,x86-64) => /usr/lib64/libblas.so
libblas.so (libc6) => /usr/lib/libblas.so

And I have this in my /etc/ld.so.conf:
include ld.so.conf.d/*.conf
/usr/lib64

Then ran 'ldconfig'.

I also have this:
/usr/lib64:$ yum list libgfortran
Loaded plugins: aliases, changelog, kabi, presto, refresh-packagekit,
security, tmprepo, ulninfo, verify, versionlock
Loading support for kernel ABI
Installed Packages
libgfortran.i686     4.4.7-16.el6   @base
libgfortran.x86_64   4.4.7-16.el6   @anacond

I've set:
LD_LIBRARY_PATH=/usr/lib64
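
For what it's worth, this is the minimal check I've been running to see which
backend netlib-java actually picks up (assuming the com.github.fommil.netlib
jars are on the classpath):

import com.github.fommil.netlib.{BLAS, LAPACK}

object NetlibCheck {
  def main(args: Array[String]): Unit = {
    // Prints NativeSystemBLAS / NativeRefBLAS when a native backend
    // loads, or F2jBLAS if it falls back to the pure-Java one.
    println("BLAS:   " + BLAS.getInstance().getClass.getName)
    println("LAPACK: " + LAPACK.getInstance().getClass.getName)
  }
}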

Still no luck.  Suggestions?  Do I have to 






Using netlib-java in Spark 1.6 on linux

2016-03-02 Thread cindymc
I want to take advantage of the Breeze linear algebra libraries, built on
netlib-java and used heavily by Spark ML.  I've found this amazingly
time-consuming to figure out, and have only managed it on MacOS so far.  I
want to do the same on Linux:

$ uname -a
Linux slc10whv 3.8.13-68.3.4.el6uek.x86_64 #2 SMP Tue Jul 14 15:03:36 PDT
2015 x86_64 x86_64 x86_64 GNU/Linux

This is for Spark 1.6.

For MacOS, I was able to find the *.jars in the .ivy2 cache and add them to
a combination of system and application classpaths.

For Linux, I've downloaded the Spark 1.6 source and compiled with sbt like
this:
sbt/sbt -Pyarn -DskipTests=true -Phadoop-2.3 -Dhadoop.version=2.6.0
-Pnetlib-lgpl clean update assembly package

This gives me 'spark-assembly-1.6.0-hadoop2.6.0.jar' that appears to contain
the *.so libs I need.  As an example:  netlib-native_ref-linux-x86_64.so

Now I want to compile and package my application so it picks these
netlib-java classes up at runtime.  Here's the command I'm using:

spark-submit --properties-file project-defaults.conf --class
"main.scala.SparkLDADemo" --jars
lib/stanford-corenlp-3.6.0.jar,lib/stanford-corenlp-3.6.0-models.jar,/scratch/cmcmulle/programs/spark/spark-1.6.0/assembly/target/scala-2.10/spark-assembly-1.6.0-hadoop2.6.0.jar
target/scala-2.10/sparksql-demo_2.10-1.0.jar

Still, I get the dreaded:
"16/03/02 16:49:21 WARN BLAS: Failed to load implementation from:
com.github.fommil.netlib.NativeSystemBLAS
16/03/02 16:49:21 WARN BLAS: Failed to load implementation from:
com.github.fommil.netlib.NativeRefBLAS"

Can someone please tell me how to build/configure/run a standalone SparkML
application using spark-submit such that it is able to load/use the
netlib-java classes?
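
One alternative I'm considering, rather than relying on the assembly jar, is
declaring the netlib natives directly in the application build.  A sketch of
the build.sbt, assuming sbt 0.13 and netlib-java 1.1.2 (the version Spark
1.6's Breeze dependency is built against, as far as I can tell):

name := "sparksql-demo"

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  // Spark itself is provided by spark-submit at runtime.
  "org.apache.spark" %% "spark-mllib" % "1.6.0" % "provided",
  // The "all" POM pulls in the JNI loaders plus prebuilt native_ref
  // and native_system binaries for common platforms.
  "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
)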

Thanks --


