I am trying to use PySpark to read a Kafka stream and then write it to Redis. However, PySpark does not have support for a ForEach sink. So, I am thinking of reading the Kafka stream into a DataFrame in Python and then handing that DataFrame to a Scala application to be written to Redis. Is there a way to do this? All I have found is to extract the JVM instance from the SparkSession and call into it, something like this:
```python
spark.sparkContext._jvm.com.application.writeToRedis(df._jdf)
```

Is this the correct approach?
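For context, here is a minimal sketch of the full flow I have in mind, assuming a Kafka broker at `localhost:9092`, a topic named `events`, and a Scala object `com.application.RedisWriter` with a `writeToRedis(df: DataFrame)` method on the driver's classpath (all of these names are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-redis").getOrCreate()

# Read the Kafka topic as a streaming DataFrame.
# Requires the spark-sql-kafka package on the classpath.
df = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "events")                        # placeholder topic
    .load()
)

# Hand the underlying Java DataFrame to the Scala side through the Py4J
# gateway. df._jdf is the Java DataFrame backing the Python DataFrame,
# and _jvm is the Py4J view of the driver JVM. RedisWriter is a
# hypothetical Scala object supplied by my application jar.
spark.sparkContext._jvm.com.application.RedisWriter.writeToRedis(df._jdf)
```

The Scala jar would be shipped with `spark-submit --jars` so that `com.application.RedisWriter` is visible to the driver JVM.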