OK, so the problem was that I was compiling against the 2.1.0 jars while supplying 2.1.1 at run time. Once I switched to 2.1.1 at compile time as well, everything works fine and I can see all my 75 fields.
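
In case anyone else runs into this, a quick way to catch the mismatch is to compare the Spark version actually on the runtime classpath against the one you compiled with. Here is a minimal sketch in Java (the hard-coded compile-time version string and class name are just for illustration; keep the string in sync with your build file):

import org.apache.spark.sql.SparkSession;

public class VersionCheck {
    public static void main(String[] args) {
        // Version the code was compiled against (illustrative; keep in sync with your pom/sbt file)
        final String compiledAgainst = "2.1.1";

        SparkSession spark = SparkSession.builder()
                .appName("version-check")
                .getOrCreate();

        // SparkSession.version() reports the version of the Spark jars on the runtime classpath
        String runtime = spark.version();
        System.out.println("Compiled against: " + compiledAgainst + ", running on: " + runtime);

        if (!runtime.equals(compiledAgainst)) {
            System.err.println("Spark version mismatch between compile time and run time!");
        }

        spark.stop();
    }
}
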
On Thu, May 18, 2017 at 2:39 AM, kant kodali <kanth...@gmail.com> wrote:
> Hi All,
>
> Here is my code:
>
> Dataset<Row> df = ds.select(functions.from_json(new Column("value").cast("string"), getSchema()).as("payload"));
>
> Dataset<Row> df1 = df.selectExpr("payload.data.*");
>
> StreamingQuery query = df1.writeStream().outputMode("append").option("truncate", "false").format("console").start();
>
> query.awaitTermination();
>
> payload.data is a struct with 75 fields, so the above code runs without any errors; however, the output doesn't get printed at all, even after waiting for 5 minutes. But if I select, say, 3 columns like payload.data.col1, payload.data.col2, payload.data.col3, then things work fine and I can see the data read from Kafka, though it still doesn't look instantaneous (it still takes about a minute or two if I select fewer than 10 fields). I am wondering why nothing gets printed to the console when I select payload.data.*? The overall payload size is 2 KB. I am using Spark 2.1.1, I have enough free memory and disk space on my machines, and I am running in client mode.
>
> Thanks!
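
For completeness, here is roughly what the whole job looks like end to end, including the Kafka source that wasn't shown in the snippet above. The bootstrap servers, topic name, class name, and the body of getSchema() are placeholders, so treat this as a sketch rather than the exact job:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.streaming.StreamingQuery;
import org.apache.spark.sql.types.StructType;

public class KafkaJsonToConsole {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-json-console")
                .getOrCreate();

        // Kafka source; bootstrap servers and topic name are placeholders
        Dataset<Row> ds = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "my-topic")
                .load();

        // Parse the Kafka "value" bytes as JSON using the 75-field schema
        Dataset<Row> df = ds.select(
                functions.from_json(ds.col("value").cast("string"), getSchema()).as("payload"));

        // Flatten the nested struct into top-level columns
        Dataset<Row> df1 = df.selectExpr("payload.data.*");

        StreamingQuery query = df1.writeStream()
                .outputMode("append")
                .option("truncate", "false")
                .format("console")
                .start();

        query.awaitTermination();
    }

    // Placeholder: build the StructType describing payload.data here
    private static StructType getSchema() {
        return new StructType(); // illustrative only
    }
}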