Hi,

I'm not sure I follow your issue. Can you please post the output of books_inexp.show()?
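
In the meantime, if the goal is to expand the XMLComment column of every row (rather than load a single XML file), one option might be to parse the column with a UDF and explode the result. The following is only a rough sketch for spark-shell, under several assumptions: scala.xml is on the classpath, XMLComment arrives as a string column, and every value is well-formed XML (row 1 of the sample as posted would need its <Comment> elements closed). The names parseComments, parseCommentsUdf and flattenComments are made up for illustration; the column names are taken from your sample.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode, udf}
import scala.xml.XML

// Hypothetical helper: turn one XMLComment string into (Title, Description) pairs.
// Null or blank input yields an empty list; malformed XML will still throw.
def parseComments(xml: String): Seq[(String, String)] =
  Option(xml).filter(_.trim.nonEmpty).toList.flatMap { s =>
    (XML.loadString(s) \\ "Comment").toList.map { c =>
      ((c \ "Title").text.trim, (c \ "Description").text.trim)
    }
  }

// Wrap the parser as a UDF; it returns an array of (_1, _2) structs per row.
val parseCommentsUdf = udf(parseComments _)

// Hypothetical helper: one output row per <Comment> element.
// Assumes df has columns SequenceID, Name, City and XMLComment (string).
def flattenComments(df: DataFrame): DataFrame =
  df.withColumn("c", explode(parseCommentsUdf(col("XMLComment"))))
    .select(
      col("SequenceID"), col("Name"), col("City"),
      col("c._1").as("Title"),
      col("c._2").as("Description"))
```

With df standing for the DataFrame read from the Oracle table, flattenComments(df).show() should then list one line per comment. Note that explode drops rows whose XML contains no <Comment> at all.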

On Thu, Jun 29, 2017 at 2:30 PM, Talap, Amol <amol.ta...@capgemini.com> wrote:

> Hi:
>
> We are trying to parse XML data to get below output from given input sample.
>
> Can someone suggest a way to pass one DFrames output into load() function or any other alternative to get this output.
>
> Input Data from Oracle Table XMLBlob:
>
> SequenceID | Name | City | XMLComment
> 1 | Amol | Kolhapur | <books><Comments><Comment><Title>Title1.1</Title><Description>Description_1.1</Description><Comment><Title>Title1.2</Title><Description>Description_1.2</Description><Comment><Title>Title1.3</Title><Description>Description_1.3</Description></Comment></Comments></books>
> 2 | Suresh | Mumbai | <books><Comments><Comment><Title>Title2</Title><Description>Description_2</Description></Comment></Comments></books>
> 3 | Vishal | Delhi | <books><Comments><Comment><Title>Title3</Title><Description>Description_3</Description></Comment></Comments></books>
> 4 | Swastik | Bangalore | <books><Comments><Comment><Title>Title4</Title><Description>Description_4</Description></Comment></Comments></books>
>
> Output Data Expected using Spark SQL:
>
> SequenceID | Name | City | Title | Description
> 1 | Amol | Kolhapur | Title1.1 | Description_1.1
> 1 | Amol | Kolhapur | Title1.1 | Description_1.2
> 1 | Amol | Kolhapur | Title1.3 | Description_1.3
> 2 | Suresh | Mumbai | Title2 | Description_2
> 3 | Vishal | Delhi | Title3.1 | Description_3.1
> 4 | Swastik | Bangalore | Title4 | Description_4
>
> I am able to parse single XML using below approach in spark-shell using example below but how do we apply the same recursively for all rows ?
>
> https://community.hortonworks.com/questions/71538/parsing-xml-in-spark-rdd.html
>
> val dfX = sqlContext.read.format("com.databricks.spark.xml").option("rowTag","book").load("books.xml")
> val xData = dfX.registerTempTable("books")
> dfX.printSchema()
> val books_inexp = sqlContext.sql("select title,author from books where price<10")
> books_inexp.show
>
> Regards,
> Amol

--
Best Regards,
Ayan Guha