Hi,
I know that, but my purpose is to transform a Dataset of JSON strings into a
typed Dataset, while spark.readStream can only read JSON files from a
specified path.
https://stackoverflow.com/questions/48617474/how-to-convert-json-dataset-to-dataframe-in-spark-structured-streaming
gives the essential method.
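For reference, a minimal sketch of that approach, assuming the JSON strings arrive from a Kafka topic (source, topic name, and the schema fields `id` and `value` are all hypothetical placeholders here — substitute your actual source and schema):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

public class ParseJsonStream {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder()
        .appName("parse-json-stream")
        .getOrCreate();

    // Hypothetical source: a Kafka topic carrying JSON strings in "value".
    Dataset<Row> raw = spark.readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "events")
        .load();

    // Hypothetical schema; replace with your JSON's actual fields.
    StructType schema = new StructType()
        .add("id", DataTypes.LongType)
        .add("value", DataTypes.StringType);

    // from_json parses each JSON string into a struct column,
    // which is then flattened into top-level columns.
    Dataset<Row> parsed = raw
        .selectExpr("CAST(value AS STRING) AS json")
        .select(from_json(col("json"), schema).as("data"))
        .select("data.*");

    // The streaming query must be started via writeStream.
    parsed.writeStream()
        .format("console")
        .start()
        .awaitTermination();
  }
}
```

The key point is that the parsing happens with the `from_json` column function on the streaming Dataset itself, rather than with a second `spark.read().json(...)` call.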
Have you read through the documentation of Structured Streaming?
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
One of the basic mistakes you are making is defining the dataset with
`spark.read()`. You define a streaming Dataset with `spark.readStream()`.
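A minimal sketch of the difference, assuming JSON files land in a watched directory (the path and the schema fields are illustrative placeholders):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ReadJsonStream {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder()
        .appName("read-json-stream")
        .getOrCreate();

    // File streaming sources require an explicit schema
    // (hypothetical fields here).
    StructType schema = new StructType()
        .add("id", DataTypes.LongType)
        .add("value", DataTypes.StringType);

    // readStream (not read) defines a streaming Dataset; the directory
    // is monitored for new JSON files as they arrive.
    Dataset<Row> stream = spark.readStream()
        .schema(schema)
        .json("/path/to/input/dir");  // hypothetical directory

    // A streaming query can only be executed via writeStream.start();
    // calling batch-style actions on it raises the error quoted below.
    stream.writeStream()
        .format("console")
        .start()
        .awaitTermination();
  }
}
```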
On Thu,
Hi, Tathagata
I have tried Structured Streaming, but this line:
> Dataset<Row> rowDataset = spark.read().json(jsondataset);
always throws:
> Queries with streaming sources must be executed with writeStream.start()
But what I need to do in this step is only transform the JSON string data to a
Dataset. Ho
It's not very surprising that doing this sort of RDD-to-DataFrame conversion
inside DStream.foreachRDD has weird corner cases like this. In fact, you
are going to have additional problems with partial Parquet files (when
there are failures) in this approach. I strongly suggest that you use
Structured Streaming