After reading about Spark's pipe RDD (RDD.pipe), what I was wondering is this: it seems we can execute any Python code through it, including machine learning code, and that code is going to execute in a distributed manner as well.
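To make my understanding concrete, here is a minimal plain-Python sketch (no Spark involved) of what I believe pipe does: the elements of each partition are written, one per line, to the stdin of a separate external process, and the lines that process prints become the output. The script, the helper name pipe_partition, and the two hand-made partitions are my own assumptions for illustration, not Spark code.

```python
import subprocess
import sys

# Hypothetical stand-in for an external script: it reads numbers from stdin
# and prints their mean. It only ever sees the lines of one partition.
MEAN_SCRIPT = (
    "import sys\n"
    "values = [float(line) for line in sys.stdin if line.strip()]\n"
    "print(sum(values) / len(values))\n"
)


def pipe_partition(partition):
    """Feed one partition to a separate process, one element per line."""
    result = subprocess.run(
        [sys.executable, "-c", MEAN_SCRIPT],
        input="".join(f"{x}\n" for x in partition),
        capture_output=True,
        text=True,
        check=True,
    )
    # Each line printed by the process becomes one output element (a string).
    return result.stdout.splitlines()


# Two hand-made "partitions" standing in for the partitions of an RDD.
partitions = [[1, 2, 3], [10, 20]]
piped = [line for part in partitions for line in pipe_partition(part)]
print(piped)  # ['2.0', '15.0']
```

If this reading is right, each process computes a result for its own partition only (here two separate means, not the mean over all five values), which is part of why I am unsure how this compares with Spark ML.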
So if we can run machine learning code in a distributed manner with a pipe RDD, what is the usefulness of Spark ML? Is there any fundamental difference between running Python ML code via a Spark pipe RDD and using Spark ML?