Hi Andrew, I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on ML algorithms, as I'm particularly curious about the relative performance of MLlib in Scala vs the Python MLlib API vs pure Python implementations.
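For concreteness, here is a rough sketch of how basic RDD operations such as textFile, count, and reduce might be timed on the PySpark side. It is not the actual benchmark code: the helper names `time_op` and `benchmark_basic_ops`, the `path` argument, and the choice of median over five runs are all assumptions made for illustration, and `sc` is assumed to be an already-created SparkContext.

```python
import statistics
import time
from operator import add


def time_op(fn, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls of fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def benchmark_basic_ops(sc, path, repeats=5):
    """Time count and map/reduce over a text file (hypothetical helper).

    `sc` is an existing SparkContext; `path` is a placeholder for a
    non-empty text file of your choosing.
    """
    lines = sc.textFile(path)  # lazy: nothing is read until an action runs
    return {
        # Each action re-reads the file, since `lines` is not cached.
        "textFile+count": time_op(lambda: lines.count(), repeats),
        # Total number of characters across all lines.
        "map+reduce": time_op(lambda: lines.map(len).reduce(add), repeats),
    }
```

Timing the same operations on the same file from Scala would give the other half of the comparison. The median is used here only to damp the effect of JVM warm-up and other one-off delays; other summaries may suit better.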
I'll share real results as soon as I have them, but in our hands that 40% number is roughly right, at least for some basic operations (e.g. textFile, count, reduce).

-- Jeremy

---------------------
Jeremy Freeman, PhD
Neuroscientist
@thefreemanlab