Re: How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-27 Thread Arbab Khalil
Please try adding another option for the starting offset. I have done the same thing many times with different versions of Spark that support Structured Streaming. The other possibility I see is that the problem occurs at write time. Can you please confirm by calling the printSchema function after
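
A minimal sketch of what adding a starting offset and checking the schema could look like in PySpark. The broker and topic names are the ones used elsewhere in this thread; the value "earliest" and the helper names kafka_source_options and read_kafka_stream are assumptions made for illustration, not something the poster specified.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="earliest"):
    """Collect Kafka source options in a plain dict (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # "earliest" or "latest"; streaming queries default to "latest",
        # so without this option only newly arriving messages are read.
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, options):
    """Build a streaming DataFrame from Kafka.

    Assumes the spark-sql-kafka-0-10 package is on the classpath.
    """
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Usage, assuming an existing SparkSession named `spark`:
# df = read_kafka_stream(spark, kafka_source_options("kafka_broker", "test_hdfs3"))
# df.printSchema()  # the Kafka source exposes key, value, topic, partition,
#                   # offset, timestamp and timestampType columns
```

Calling printSchema right after load() shows the raw Kafka columns, which makes it easier to tell whether a later failure comes from the read side or the write side.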

Re: How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-27 Thread dddaaa
This is a mistake in the code snippet I posted. The right code that is actually running and producing the error is:

    df = spark \
        .readStream \
        .format("kafka") \
        .option("kafka.bootstrap.servers", "kafka_broker") \
        .option("subscribe", "test_hdfs3") \
        .load()

Re: How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-27 Thread Arbab Khalil
Why are you reading a batch from Kafka and writing it as a stream?

On Fri, Jul 27, 2018, 1:40 PM dddaaa wrote:
> No, I just made sure I'm not doing it.
> changed the path in .start() to another path and the same still occurs.
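
On the batch-versus-stream question: spark.read returns a batch DataFrame and spark.readStream returns a streaming one, and writeStream only works on the latter. A small guard like the following can surface the mix-up early. This is only a sketch; the helper name require_streaming is hypothetical and not part of Spark.

```python
def require_streaming(df):
    """Return df unchanged if it is a streaming DataFrame, else fail early."""
    # DataFrame.isStreaming is True only when the DataFrame was built from a
    # streaming source, i.e. via spark.readStream rather than spark.read.
    if not df.isStreaming:
        raise ValueError(
            "batch DataFrame: build it with spark.readStream instead of "
            "spark.read before calling writeStream"
        )
    return df


# Usage, assuming `df` was created earlier in the job:
# require_streaming(df).writeStream ...
```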

Re: How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-27 Thread dddaaa
No, I just made sure I'm not doing it. changed the path in .start() to another path and the same still occurs.

Re: How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-26 Thread Tathagata Das
Are you writing the output of multiple streaming queries to the same location? If so, I can see this error occurring. Multiple streaming queries writing to the same directory is not supported.

On Tue, Jul 24, 2018 at 3:38 PM, dddaaa wrote:
> I'm trying to read json messages from kafka and store them in
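
To illustrate the point about one directory per query, here is a minimal sketch that derives a separate output path and checkpoint path for each streaming query. The directory layout, the base path in the usage comment, and the helper names sink_paths and start_file_sink are assumptions for illustration only.

```python
import posixpath


def sink_paths(base_dir, query_name):
    """Return (output_path, checkpoint_path) unique to one query.

    The data/ and checkpoints/ layout is a hypothetical convention.
    """
    return (
        posixpath.join(base_dir, "data", query_name),
        posixpath.join(base_dir, "checkpoints", query_name),
    )


def start_file_sink(df, base_dir, query_name, file_format="parquet"):
    """Start a file-sink query that writes only to its own directories."""
    output_path, checkpoint_path = sink_paths(base_dir, query_name)
    return (
        df.writeStream
        .format(file_format)
        .queryName(query_name)
        # Each query also needs its own checkpoint location.
        .option("checkpointLocation", checkpoint_path)
        .start(output_path)
    )


# Usage, assuming `df` is a streaming DataFrame and the base path exists:
# query = start_file_sink(df, "hdfs:///tmp/kafka_out", "test_hdfs3_json", "json")
```

Deriving both paths from the query name keeps two queries from ever sharing an output or checkpoint directory by accident.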

How to read json data from kafka and store to hdfs with spark structured streaming?

2018-07-24 Thread dddaaa
I'm trying to read json messages from kafka and store them in hdfs with spark structured streaming. I followed the example here: https://spark.apache.org/docs/2.1.0/structured-streaming-kafka-integration.html and when my code looks like this:

    df = spark \
        .read \