*Cc:* Michael Armbrust; user
*Subject:* Re: get corrupted rows using columnNameOfCorruptRecord
Let me please extend the suggestion a bit more verbosely. I think you could try something like this:

val jsonDF = spark.read
  .option("columnNameOfCorruptRecord", "xxx")
  // when you pass an explicit schema, the corrupt-record column ("xxx")
  // must also be present in that schema, as a StringType field
  .schema(df_schema.schema.add("xxx", StringType))
  .json("/tmp/x")
> ...Analysis(Analyzer.scala:58)
> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala
*Sent:* Tuesday, December 06, 2016 10:26 PM
*To:* Yehuda Finkelstein
*Cc:* user
*Subject:* Re: get corrupted rows using columnNameOfCorruptRecord
.where("xxx IS NOT NULL") will give you the rows that couldn't be parsed.
On Tue, Dec 6, 2016 at 6:31 AM, Yehuda Finkelstein <
yeh...@veracity-group.com> wrote:
Hi all

I'm trying to parse JSON using an existing schema, and I'm getting rows with NULLs.

//get schema
val df_schema = spark.sqlContext.sql("select c1,c2,…cn from t1 limit 1")

//read json file
val f = sc.textFile("/tmp/x")

//load json into data frame using schema
var df = spark.sqlContext.read
  .option("columnNameOfCorruptRecord", "xxx")
  .schema(df_schema.schema)
  .json(f)
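Putting the two replies together, a minimal end-to-end sketch of the approach. This assumes Spark 2.x; the column names `c1`/`c2`, table `t1`, path `/tmp/x`, and the corrupt-record column name "xxx" are taken from the thread, while the explicit `StructType.add` step is my addition — with an explicit schema, the corrupt-record column must be declared as a StringType field for the option to capture anything:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructType}

object CorruptRowsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("corrupt-rows").getOrCreate()

    // schema of the good columns, pulled from an existing table as in the original post
    val df_schema = spark.sql("select c1, c2 from t1 limit 1")

    // append the corrupt-record column; it must be StringType and part of the schema
    val schemaWithCorrupt: StructType = df_schema.schema.add("xxx", StringType)

    val jsonDF = spark.read
      .option("columnNameOfCorruptRecord", "xxx")
      .schema(schemaWithCorrupt)
      .json("/tmp/x")

    // rows that failed to parse carry their raw text in "xxx"; parsed rows leave it null
    val corrupt = jsonDF.where("xxx IS NOT NULL")
    corrupt.show(truncate = false)
  }
}
```

In the default PERMISSIVE mode this keeps every input line, so filtering on the corrupt column cleanly separates bad records from good ones without a second pass over the file.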