Re: spark-xml can't recognize schema

2016-02-21 Thread Dave Moyers
Make sure the xml input file is well formed (check your end tags).

> On Feb 21, 2016, at 8:14 AM, Prathamesh Dharangutte wrote:
>
> This is the code I am using for parsing xml file:
>
> import org.apache.spark.{SparkConf,SparkContext}
> import org.apache.spark.sq

Re: spark-xml can't recognize schema

2016-02-21 Thread Sebastian Piu
did you include that in your xml file?
>
> *From: *Sebastian Piu
> *Sent: *Sunday, 21 February 2016 20:00
> *To: *Prathamesh Dharangutte
> *Cc: *user@spark.apache.org
> *Subject: *Re: spark-xml can't recognize schema
>
> Just ran that code and it works fine, here is

Re: spark-xml can't recognize schema

2016-02-21 Thread Prathamesh Dharangutte
I am using spark 1.4.0 with scala 2.10.4 and version 0.3.2 of spark-xml. Orderid is empty for some books and has multiple entries for other books. Did you include that in your xml file?

Re: spark-xml can't recognize schema

2016-02-21 Thread Sebastian Piu
Just ran that code and it works fine, here is the output:
What version are you using?

val ctx = SQLContext.getOrCreate(sc)
val df = ctx.read.format("com.databricks.spark.xml").option("rowTag", "book").load("file:///tmp/sample.xml")
df.printSchema()

root
 |-- name: long (nullable = true)
 |-- ord

Re: spark-xml can't recognize schema

2016-02-21 Thread Prathamesh Dharangutte
This is the code I am using for parsing xml file:

import org.apache.spark.{SparkConf,SparkContext}
import org.apache.spark.sql.{DataFrame,SQLContext}
import com.databricks.spark.xml

object XmlProcessing {
  def main(args : Array[String]) = {
    val conf = new SparkConf()
      .setAppName("

Re: spark-xml can't recognize schema

2016-02-21 Thread Sebastian Piu
Can you paste the code you are using?

On Sun, 21 Feb 2016, 13:19 Prathamesh Dharangutte wrote:

> I am trying to parse xml file using spark-xml. But for some reason when i
> print schema it only shows root instead of the hierarchy. I am using
> sqlcontext to read the data. I am proceeding accord

spark-xml can't recognize schema

2016-02-21 Thread Prathamesh Dharangutte
I am trying to parse an XML file using spark-xml, but for some reason when I print the schema it only shows root instead of the hierarchy. I am using SQLContext to read the data. I am proceeding according to this video: https://www.youtube.com/watch?v=NemEp53yGbI The structure of xml file is somewhat l