I am trying to write key-value pairs to Redis from a DataStreamWriter using
the PySpark Structured Streaming APIs. I am on Spark 2.2.

Since the Foreach sink is not supported for Python (see
<http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach>),
I am trying to find some alternatives.

One alternative is to write a separate Scala module solely to push data into
Redis using foreach; ForeachWriter
<http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.ForeachWriter>
is supported in Scala. But this doesn't seem like an efficient approach, and
it adds deployment overhead because I would now have to support Scala in my
app.
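
Roughly, the Scala side I have in mind would look something like the sketch
below. This assumes the Jedis client and a streaming DataFrame with string
"key" and "value" columns; the class name and connection details are just
illustrative:

import org.apache.spark.sql.{ForeachWriter, Row}
import redis.clients.jedis.Jedis

// Sketch only: pushes each row's key/value pair into Redis.
class RedisForeachWriter(host: String, port: Int) extends ForeachWriter[Row] {
  @transient private var jedis: Jedis = _

  // Called once per partition on the executor, so the connection is
  // opened there rather than serialized from the driver.
  override def open(partitionId: Long, version: Long): Boolean = {
    jedis = new Jedis(host, port)
    true
  }

  // Called for every row in the partition.
  override def process(row: Row): Unit = {
    jedis.set(row.getAs[String]("key"), row.getAs[String]("value"))
  }

  // Called when the partition finishes (or on error).
  override def close(errorOrNull: Throwable): Unit = {
    if (jedis != null) jedis.close()
  }
}

// Usage from the Scala job:
// df.writeStream.foreach(new RedisForeachWriter("localhost", 6379)).start()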

Another approach is obviously to use Scala instead of Python, which is fine,
but I want to make sure that I absolutely cannot use Python for this problem
before I take that path.

I would appreciate some feedback and alternative design approaches for this
problem.

Thanks.
