Well, your mileage may vary, so to speak.

   - Spark itself is written in Scala. However, that does not imply you
   should stick with Scala.
   - I have used both for Spark Streaming and Spark Structured Streaming;
   both work fine.
   - PySpark has become popular with the widespread use of Data Science
   projects.
   - What normally matters is the skill set you already have in-house. The
   likelihood is that you have more Python developers than Scala developers,
   and the learning curve for Scala has to be taken into account.
   - Concerns about performance and the like are tangential.
   - With regard to the Spark code itself, there should be little effort
   in converting from Scala to PySpark or vice versa (a minimal sketch
   follows this list).
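
To illustrate that last point, here is a minimal Structured Streaming sketch
in PySpark. The Kafka broker address, topic name and app name below are made
up purely for illustration. The Scala version reads almost line for line the
same, since both languages sit on the same DataFrame API, which is why the
conversion effort is usually small.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("StreamingSketch")   # hypothetical app name
         .getOrCreate())

# Read a stream from a Kafka source (broker and topic are assumed values)
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load())

# Cast the Kafka value to a string and write it to the console sink
query = (df.selectExpr("CAST(value AS STRING) AS value")
         .writeStream
         .format("console")
         .outputMode("append")
         .start())

query.awaitTermination()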

HTH


   View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 2 Nov 2022 at 08:54, Joris Billen <joris.bil...@bigindustries.be>
wrote:

> Dear community,
> I had a general question about the use of Scala vs PySpark for Spark
> streaming.
> I believe Spark streaming will work most efficiently when written in
> Scala. I believe, however, that things can also be implemented in PySpark. My
> questions:
> 1) Is it completely dumb to make a streaming job in PySpark?
> 2) What are the technical reasons that it is done best in Scala (is this
> easy to understand why)?
> 3) Any good links anyone has seen with numbers on the difference in
> performance, under what circumstances, and an explanation?
> 4) Are there certain scenarios where the use of PySpark can be motivated
> (maybe when someone doesn't feel comfortable writing a job in Scala and the
> number of messages/minute isn't gigantic, so performance isn't that crucial)?
>
> Thanks for any input!
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
