Re: Python performance

2022-02-06 Thread Hinko Kocevar
Thanks for your input, guys! //hinko On 4 Feb 2022, at 14:58, Sean Owen wrote: Yes, in the sense that any transformation that can be expressed in the SQL-like DataFrame API will push down to the JVM, and take advantage of other optimizations, avoiding the data movement to/from Python and mo

Re: Python performance

2022-02-04 Thread Sean Owen
Yes, in the sense that any transformation that can be expressed in the SQL-like DataFrame API will push down to the JVM, and take advantage of other optimizations, avoiding the data movement to/from Python and more. But you can't do this if you're expressing operations that are not in the DataFrame
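
To illustrate the distinction described in the message above, here is a minimal sketch of the same word count written two ways in PySpark. The function names (`word_counts_local`, `word_counts_rdd`, `word_counts_df`) and the assumption that the caller passes in an existing `SparkSession` are illustrative and not taken from the thread. The RDD version runs Python lambdas, so records have to move between the JVM and Python workers; the DataFrame version uses only built-in expressions, which Spark can plan and execute in the JVM.

```python
from collections import Counter


def word_counts_local(lines):
    """Pure-Python reference, useful for checking Spark results on small inputs."""
    return Counter(word for line in lines for word in line.split())


def word_counts_rdd(spark, lines):
    """RDD version (hypothetical helper): the lambdas below execute in Python
    worker processes, so each record is serialized between the JVM and Python."""
    rdd = spark.sparkContext.parallelize(lines)
    pairs = (
        rdd.flatMap(lambda line: line.split())
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )
    return dict(pairs.collect())


def word_counts_df(spark, lines):
    """DataFrame version (hypothetical helper): split, explode and groupBy are
    built-in expressions, so no Python function is called per record."""
    from pyspark.sql import functions as F  # lazy import; requires pyspark

    df = spark.createDataFrame([(line,) for line in lines], ["line"])
    words = df.select(F.explode(F.split(F.col("line"), r"\s+")).alias("word"))
    counts = words.where(F.col("word") != "").groupBy("word").count()
    return {row["word"]: row["count"] for row in counts.collect()}
```

With pyspark installed, both Spark functions can be called with a session such as `SparkSession.builder.master("local[*]").getOrCreate()` and compared against `word_counts_local` on a small list of strings. Any timing difference will depend on the Spark version, data size and cluster, so it should be measured rather than assumed.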

Re: Python performance

2022-02-04 Thread Bitfox
Please see my test: https://blog.cloudcache.net/computing-performance-comparison-for-words-statistics/ Don’t use Python RDDs; use DataFrames instead. Regards On Fri, Feb 4, 2022 at 5:02 PM Hinko Kocevar wrote: > I'm looking into using Python interface with Spark and came across this > [1]

Python performance

2022-02-04 Thread Hinko Kocevar
I'm looking into using the Python interface with Spark and came across this [1] chart showing some performance hit when going with Python RDD. The data is ~7 years old and is for an older version of Spark. Is this still the case with more recent Spark releases? I'm trying to understand what to expect from Pytho

Re: Scala vs Python performance differences

2015-01-16 Thread Davies Liu
o there's one data point, if only for the obvious data point comparing > computations in Scala to computations in pure Python. > > > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Scala-vs-Python-performance-differences

Re: Scala vs Python performance differences

2015-01-16 Thread philpearl
ons in pure Python. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Scala-vs-Python-performance-differences-tp4247p21190.html Sent from the Apache Spark User List mailing list archive at

Re: Scala vs Python performance differences

2014-11-12 Thread Samarth Mailinglist
I was about to ask this question. On Wed, Nov 12, 2014 at 3:42 PM, Andrew Ash wrote: > Jeremy, > > Did you complete this benchmark in a way that's shareable with those > interested here? > > Andrew > > On Tue, Apr 15, 2014 at 2:50 PM, Nicholas Chammas < > nicholas.cham...@gmail.com> wrote: > >>

Re: Scala vs Python performance differences

2014-11-12 Thread Andrew Ash
Jeremy, Did you complete this benchmark in a way that's shareable with those interested here? Andrew On Tue, Apr 15, 2014 at 2:50 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > I'd also be interested in seeing such a benchmark. > > > On Tue, Apr 15, 2014 at 9:25 AM, Ian Ferreira >

Re: Scala vs Python performance differences

2014-04-15 Thread Nicholas Chammas
I'd also be interested in seeing such a benchmark. On Tue, Apr 15, 2014 at 9:25 AM, Ian Ferreira wrote: > This would be super useful. Thanks. > > On 4/15/14, 1:30 AM, "Jeremy Freeman" wrote: > > >Hi Andrew, > > > >I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on > >ML

Re: Scala vs Python performance differences

2014-04-15 Thread Ian Ferreira
This would be super useful. Thanks. On 4/15/14, 1:30 AM, "Jeremy Freeman" wrote: >Hi Andrew, > >I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on >ML algorithms, as I'm particularly curious about the relative performance >of >MLlib in Scala vs the Python MLlib API vs pur

Re: Scala vs Python performance differences

2014-04-14 Thread Jeremy Freeman
le.com/Scala-vs-Python-performance-differences-tp4247p4261.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Scala vs Python performance differences

2014-04-14 Thread Bin Wang
At least, Spark Streaming doesn't support Python at this moment, right? On Mon, Apr 14, 2014 at 6:48 PM, Andrew Ash wrote: > Hi Spark users, > > I've always done all my Spark work in Scala, but occasionally people ask > about Python and its performance impact vs the same algorithm > implementat

Scala vs Python performance differences

2014-04-14 Thread Andrew Ash
Hi Spark users, I've always done all my Spark work in Scala, but occasionally people ask about Python and its performance impact vs the same algorithm implementation in Scala. Has anyone done tests to measure the difference? Anecdotally I've heard Python is a 40% slowdown but that's entirely hea