And you are an expert on Python! Idiomatic... Please do everyone a favor and stop commenting on things you have no idea about... I have built ETL systems in Python that wiped out commercial Java stacks left and right. PySpark was, is, and will be a second-class citizen in the Spark world. That has nothing to do with Python. And as far as Scala is concerned, good luck with it...
On Sat, Oct 17, 2020, 8:53 AM Molotch <ma...@kth.se> wrote:

> I would say the pros and cons of Python vs Scala come down to Spark, the
> languages themselves, and what kind of data engineer you will get when
> you try to hire for the different solutions.
>
> With PySpark you get less functionality and increased complexity with the
> py4j Java interop compared to vanilla Spark. Why would you want that? Maybe
> you want the Python ML tools and have a clear use case, then go for it. If
> not, avoid the increased complexity and reduced functionality of PySpark.
>
> Python vs Scala? Idiomatic Python is a lesson in bad programming
> habits/ideas, there's no other way to put it. Do you really want
> programmers who enjoy coding in such a language hacking away at your
> system?
>
> Scala might be far from perfect with the plethora of ways to express
> yourself. But Python < 3.5 is not fit for anything except simple scripting
> IMO.
>
> For exploratory data analysis in a Jupyter notebook, PySpark seems like a
> fine idea. For coding an entire ETL library including state management, the
> whole kitchen including the sink, Scala every day of the week.