I believe this would greatly depend on your use case and your familiarity with 
the languages.

In general, Scala performs much better than Python, and not all interfaces 
are available in Python.
That said, if you are planning to use DataFrames without any UDFs, the 
performance hit is practically nonexistent.
Even if you need UDFs, it is possible to write them in Scala and wrap them for 
Python, and still avoid the performance hit.
Python does not have interfaces for UDAFs.

I believe that if you have large structured data and do not generally need 
UDFs or UDAFs, you can certainly work in Python without losing too much.


From: ayan guha [mailto:guha.a...@gmail.com]
Sent: Thursday, September 01, 2016 5:03 AM
To: user
Subject: Scala Vs Python

Hi Users

Thought to ask (again and again) the question: While I am building any 
production application, should I use Scala or Python?

I have read many, if not most, articles, but all seem to be pre-Spark 2. Has 
anything changed with Spark 2, either in favour of Scala or of Python?

I am thinking performance, feature parity and future direction, not so much in 
terms of skillset or ease of use.

Or, if you think it is a moot point, please say so as well.

Any real-life examples, production experience, anecdotes, personal taste, 
profanity: all are welcome :)

--
Best Regards,
Ayan Guha




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/RE-Scala-Vs-Python-tp27637.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
