Hi
I'm wondering how readStream() and writeStream() work internally.
Let's take a simple example:
df = spark.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", kafka_brokers) \
.option("subscribe", kafka_topic) \
This is a mistake in the code snippet I posted.
The right code that is actually running and producing the error is:
df = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "kafka_broker") \
.option("subscribe", "test_hdfs3") \
.load()
No, I just made sure I'm not doing it.
I changed the path in .start() to another path and the same error still occurs.
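
For reference, here is a minimal sketch of the kind of pipeline this thread is about: a Kafka source built with readStream, followed by a file sink started with writeStream. It is not the code that produced the error. The function name, its parameters, the default "parquet" format and the separate checkpoint path are all assumptions made for illustration. The sketch also assumes the spark-sql-kafka package is on the classpath and that the file sink is given a checkpoint location.

```python
def start_kafka_to_files(spark, brokers, topic, output_path,
                         checkpoint_path, sink_format="parquet"):
    """Hypothetical helper: stream a Kafka topic into files under output_path.

    `spark` is expected to be a SparkSession; every other argument is an
    illustrative assumption and not something taken from the thread.
    """
    # Source side: subscribe to one topic on the given brokers.
    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", brokers)
          .option("subscribe", topic)
          .load())

    # Kafka delivers the payload as binary, so cast it to a string column.
    messages = df.selectExpr("CAST(value AS STRING) AS json")

    # Sink side: the checkpoint directory is kept separate from the output path.
    return (messages.writeStream
            .format(sink_format)
            .option("checkpointLocation", checkpoint_path)
            .start(output_path))


# Possible usage (the paths and broker address are placeholders):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()
#   query = start_kafka_to_files(spark, "kafka_broker", "test_hdfs3",
#                                "/tmp/out", "/tmp/checkpoint")
#   query.awaitTermination()
```

One thing this layout makes easy to check is whether the output path and the checkpoint path are really distinct when the path passed to .start() is changed.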
--
I'm trying to read JSON messages from Kafka and store them in HDFS with Spark
Structured Streaming.
I followed the example here:
https://spark.apache.org/docs/2.1.0/structured-streaming-kafka-integration.html
and when my code looks like this:
df = spark \
.read \