New to Spark and MLlib. Coming from scikit-learn.

I am launching my Spark 1.6 instance through AWS EMR and pyspark. All the 
examples using MLlib work fine.

But I have seen a couple of examples where you can combine scikit-learn 
packages and syntax with MLlib.

For example:
https://databricks.com/blog/2016/02/08/auto-scaling-scikit-learn-with-spark.html

However, it does not seem that pyspark on AWS EMR comes with scikit-learn (or 
other standard PyData packages) installed.
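
A minimal sketch of how the missing packages could be confirmed from the pyspark shell. The helper name missing_packages and the package list are illustrative assumptions, not anything EMR or Spark provides, and this only checks the Python environment where it runs (the driver), not the worker nodes.

```python
import importlib


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Package is not installed (or not importable) in this environment.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list of PyData packages to check; "sklearn" is the
    # import name of scikit-learn.
    print(missing_packages(["numpy", "scipy", "pandas", "sklearn"]))
```

An empty list would mean every listed package is importable on the node where the check ran.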

Is this something you can/should install for pyspark, and if so, how would you do it?

Thanks for assisting.


Myles
