I am trying to use PySpark to read a Kafka stream and then write it to
Redis. However, PySpark does not have support for a ForEach sink. So, I am
thinking of reading the Kafka stream into a DataFrame in Python and then
sending that DataFrame to a Scala application to be written to Redis. Is
there a way to do this? All I have found is to extract the JVM
instance from the SparkSession and do something like this:

spark.sparkContext._jvm.com.application.writeToRedis(df._jdf)

Is this the correct approach?
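
For reference, here is roughly the full Python side of what I have in mind.
The Kafka options are placeholders, and writeToRedis stands in for my actual
Scala entry point, which would take the Java Dataset and perform the write to
Redis:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-redis").getOrCreate()

# Read the Kafka topic as a streaming DataFrame (options are placeholders)
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "my-topic")
      .load()
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value"))

# Hand the underlying Java DataFrame to my Scala code through the Py4J gateway.
# writeToRedis is my own method; it would be expected to start the streaming
# write on the Scala side.
spark.sparkContext._jvm.com.application.writeToRedis(df._jdf)

# Keep the Python driver alive while the Scala-side query runs
spark.streams.awaitAnyTermination()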
