OK, so the problem was that I was compiling against the 2.1.0 jars while supplying 2.1.1 at run time. Once I switched to 2.1.1 at compile time as well, everything works fine and I can see all my 75 fields.
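
In case anyone else runs into this, a quick way to catch the mismatch is to compare the Spark version actually on the runtime classpath against the one you compiled with. Here is a minimal sketch in Java (the hard-coded compile-time version string and class name are just for illustration; keep the string in sync with your build file):

import org.apache.spark.sql.SparkSession;

public class VersionCheck {
    public static void main(String[] args) {
        // Version the code was compiled against (illustrative; keep in sync with your pom/sbt file)
        final String compiledAgainst = "2.1.1";

        SparkSession spark = SparkSession.builder()
                .appName("version-check")
                .getOrCreate();

        // SparkSession.version() reports the version of the Spark jars on the runtime classpath
        String runtime = spark.version();
        System.out.println("Compiled against: " + compiledAgainst + ", running on: " + runtime);

        if (!runtime.equals(compiledAgainst)) {
            System.err.println("Spark version mismatch between compile time and run time!");
        }

        spark.stop();
    }
}
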
On Thu, May 18, 2017 at 2:39 AM, kant kodali <kanth...@gmail.com> wrote:
> Hi All,
>
> Here is my code:
>
> Dataset<Row> df = ds.select(functions.from_json(new Column("value").cast("string"), getSchema()).as("payload"));
>
> Dataset<Row> df1 = df.selectExpr("payload.data.*");
>
> StreamingQuery query = df1.writeStream().outputMode("append").option("truncate", "false").format("console").start();
>
> query.awaitTermination();
>
> payload.data is a struct with 75 fields, so the above code runs without any errors; however, the output doesn't get printed at all, even after waiting for 5 minutes. But if I select, say, 3 columns like payload.data.col1, payload.data.col2, payload.data.col3, then things work fine and I can see the data read from Kafka, though it still doesn't look instantaneous (it still takes about a minute or two if I select fewer than 10 fields). I am wondering why nothing gets printed to the console when I select payload.data.*? The overall payload size is 2 KB. I am using Spark 2.1.1, I have enough free memory and disk space on my machines, and I am running in client mode.
>
> Thanks!
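
For completeness, here is roughly what the whole job looks like end to end, including the Kafka source that wasn't shown in the snippet above. The bootstrap servers, topic name, class name, and the body of getSchema() are placeholders, so treat this as a sketch rather than the exact job:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.streaming.StreamingQuery;
import org.apache.spark.sql.types.StructType;

public class KafkaJsonToConsole {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-json-console")
                .getOrCreate();

        // Kafka source; bootstrap servers and topic name are placeholders
        Dataset<Row> ds = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "my-topic")
                .load();

        // Parse the Kafka "value" bytes as JSON using the 75-field schema
        Dataset<Row> df = ds.select(
                functions.from_json(ds.col("value").cast("string"), getSchema()).as("payload"));

        // Flatten the nested struct into top-level columns
        Dataset<Row> df1 = df.selectExpr("payload.data.*");

        StreamingQuery query = df1.writeStream()
                .outputMode("append")
                .option("truncate", "false")
                .format("console")
                .start();

        query.awaitTermination();
    }

    // Placeholder: build the StructType describing payload.data here
    private static StructType getSchema() {
        return new StructType(); // illustrative only
    }
}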