Hello Kant,

See the examples in this blog; they explain how to deal with your particular case:
https://databricks.com/blog/2017/02/23/working-complex-data-formats-structured-streaming-apache-spark-2-1.html
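For reference, the approach that post describes (serializing a struct of columns with to_json) yields one JSON string per row. The sketch below is untested and assumes Spark 2.1+; the column names come from your sample, and the nested "result" wrapper matches the shape you asked for. The Spark calls are shown as comments, with a plain-Java check of the target string underneath:

```java
// Sketch of the Spark 2.1+ approach from the blog post (not compiled here):
//
//   import static org.apache.spark.sql.functions.*;
//   Dataset<String> json = df2
//       .select(to_json(struct(struct(col("name"), col("ratio"), col("count")).as("result"))))
//       .as(Encoders.STRING());
//
// Each row then serializes to the nested shape you described. A plain-Java
// check of that target string for your sample row:
public class JsonShape {
    // Builds the {"result": {...}} payload for one (name, ratio, count) row.
    static String toResultJson(String name, double ratio, long count) {
        return String.format(
            "{\"result\":{\"name\":\"%s\",\"ratio\":%s,\"count\":%d}}",
            name, ratio, count);
    }

    public static void main(String[] args) {
        // prints {"result":{"name":"hello","ratio":1.56,"count":34}}
        System.out.println(toResultJson("hello", 1.56, 34));
    }
}
```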
Cheers,
Jules

Sent from my iPhone
Pardon the dumb thumb typos :)

> On May 30, 2017, at 7:31 PM, kant kodali <kanth...@gmail.com> wrote:
>
> Hi All,
>
> I have a Dataset<Row> and I am trying to convert it into Dataset<String>
> (JSON String) using Spark Structured Streaming. I have tried the following:
>
>     df2.toJSON().writeStream().foreach(new KafkaSink())
>
> This doesn't seem to work, for the following reason:
>
>     "Queries with streaming sources must be executed with writeStream.start()"
>
> My dataframe looks like this:
>
>     name, ratio, count   // column names
>     "hello", 1.56, 34
>
> If I try to convert a Row into a JSON String, it results in something like
> {"key1": "name", "value1": "hello", "key2": "ratio", "value2": 1.56, "key3": "count", "value3": 34},
> but what I need is something like { "result": {"name": "hello", "ratio": 1.56, "count": 34} };
> however, I don't have a result column.
>
> It looks like there are a couple of functions, to_json and json_tuple, but they
> seem to take only one Column as a first argument, so should I call to_json on
> every column? Also, how would I turn this into a Dataset<String>?
>
> Thanks!