Re: pyspark + from_json(col("col_name"), schema) returns all null
Hi,

Not that I'm aware of, but in your case, checking whether a JSON message fits your schema (and whether the pipeline works) could have been done with pyspark alone and JSON files on disk, couldn't it?

Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Mon, Dec 11, 2017 at 12:49 AM, salemi wrote:
> I found the root cause! There was mismatch between the StructField type and
> the json message.
>
> Is there a good write up / wiki out there that describes how to debug spark
> jobs?
>
> Thanks
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Re: pyspark + from_json(col("col_name"), schema) returns all null
I found the root cause! There was a mismatch between the StructField type and the JSON message.

Is there a good write-up / wiki out there that describes how to debug Spark jobs?

Thanks
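A type mismatch like this can also be spotted before Spark is involved at all, by checking one sample message against the types the schema expects. A minimal sketch using only the Python standard library; the field names, the sample message and the mapping from StructField types to Python types (EXPECTED_TYPES) are hypothetical and would need to be adapted to the real schema.

```python
import json

# Hypothetical expected field types, mirroring the StructField types
# declared in the Spark schema (e.g. IntegerType -> int, StringType -> str).
EXPECTED_TYPES = {"id": int, "name": str}

def find_type_mismatches(message, expected_types):
    """Return (field, expected type, actual type) for every mismatched field."""
    record = json.loads(message)
    mismatches = []
    for field, expected in expected_types.items():
        value = record.get(field)
        if value is None:
            # Missing or null fields are not type mismatches.
            continue
        matches = isinstance(value, expected)
        # bool is a subclass of int in Python, so reject it for int fields.
        if expected is int and isinstance(value, bool):
            matches = False
        if not matches:
            mismatches.append((field, expected.__name__, type(value).__name__))
    return mismatches

# Hypothetical message where id is a string although the schema expects an integer.
sample = '{"id": "abc-1", "name": "sensor"}'
print(find_type_mismatches(sample, EXPECTED_TYPES))
# prints [('id', 'int', 'str')]
```

An empty list means every checked field has the expected type, so the cause of the nulls would have to be looked for elsewhere.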
pyspark + from_json(col("col_name"), schema) returns all null
Hi All,

I am using pyspark to consume messages from Kafka. When I call .select(from_json(col("col_name"), schema)), the returned values are all null. I looked at the JSON messages and they are valid strings. Any ideas?