A sensible default strategy is to use the language in which a system was
developed, or a highly compatible one. For Spark that would be Scala;
however, I assume you don't currently know Scala as well as you know
Python, if at all. In that case, to help you make the decision, you should
also
Spark is written in Scala, so yes, it's still the strongest option. With
Scala you also get the Dataset type (compile-time type safety), which is
not available in Python.
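To make the type-safety point concrete, here is a rough sketch of the difference, assuming a local Spark session. The Person case class, the sample rows, and the object name DatasetSketch are made up for illustration only:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
final case class Person(name: String, age: Int)

object DatasetSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: running locally; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .appName("dataset-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val people: Dataset[Person] =
      Seq(Person("Ann", 34), Person("Bob", 19)).toDS()

    // Typed API: the lambda is checked by the Scala compiler, so a typo
    // such as p.agee would be rejected at compile time.
    val adults: Dataset[Person] = people.filter((p: Person) => p.age >= 21)

    // Untyped DataFrame API (the style PySpark also exposes): columns are
    // referred to by name, so a typo such as $"agee" still compiles and
    // only fails when Spark analyzes the query at runtime.
    val adultsDf: DataFrame = people.toDF().filter($"age" >= 21)

    adults.show()
    adultsDf.show()
    spark.stop()
  }
}
```

Both filters select the same rows here; the difference is only in when a mistake in a field name would be caught.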
That said, I think the Python API is a viable candidate if you use pandas
for data science. There are similarit
Hello all,
I will be starting a new Spark codebase and I would like to get opinions on
using Python over Scala. Historically, the Scala API has always been the
strongest interface to Spark. Is this still true? Are there still many
benefits and additional features in the Scala API that are not avai