Hi

I'm not sure I follow your issue. Could you please post the output of
books_inexp.show()?
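
If the goal is one output row per <Comment> element, here is a minimal sketch
for spark-shell. It is not tested against your data and rests on assumptions:
the rows are already in a DataFrame (called xmlDf here, a made-up name, built
from two hand-typed sample rows instead of the Oracle table), every XMLComment
value is non-null, well-formed XML, and scala.xml is on the classpath.
parseComments and parseCommentsUdf are illustrative names, not part of
spark-xml. It swaps load() for a UDF plus explode(), since load() reads from a
path and not from a column.

```scala
import org.apache.spark.sql.functions.{col, explode, udf}
import scala.xml.XML
import sqlContext.implicits._

// One XML string in, one (Title, Description) pair per <Comment> element out.
// Assumes non-null, well-formed XML; loadString throws otherwise.
def parseComments(xml: String): Seq[(String, String)] =
  (XML.loadString(xml) \\ "Comment").map { c =>
    ((c \ "Title").text, (c \ "Description").text)
  }

val parseCommentsUdf = udf(parseComments _)

// Hypothetical stand-in for the Oracle table: two sample rows typed by hand.
val xmlDf = Seq(
  (1, "Amol", "Kolhapur",
    "<books><Comments>" +
    "<Comment><Title>Title1.1</Title><Description>Description_1.1</Description></Comment>" +
    "<Comment><Title>Title1.2</Title><Description>Description_1.2</Description></Comment>" +
    "<Comment><Title>Title1.3</Title><Description>Description_1.3</Description></Comment>" +
    "</Comments></books>"),
  (2, "Suresh", "Mumbai",
    "<books><Comments>" +
    "<Comment><Title>Title2</Title><Description>Description_2</Description></Comment>" +
    "</Comments></books>")
).toDF("SequenceID", "Name", "City", "XMLComment")

// The UDF yields an array of structs; explode() turns each element into a row.
val flat = xmlDf
  .withColumn("comment", explode(parseCommentsUdf(col("XMLComment"))))
  .select(col("SequenceID"), col("Name"), col("City"),
    col("comment._1").as("Title"), col("comment._2").as("Description"))

flat.show(false)
```

For the real table, xmlDf would presumably come from a JDBC read, and rows
with a null or malformed XMLComment would need to be filtered out or handled
inside parseComments first.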

On Thu, Jun 29, 2017 at 2:30 PM, Talap, Amol <amol.ta...@capgemini.com>
wrote:

> Hi:
>
>
>
> We are trying to parse XML data to produce the output below from the given
> input sample.
>
> Can someone suggest a way to pass one DataFrame's output into the load()
> function, or any other alternative that produces this output?
>
>
>
> Input Data from Oracle Table XMLBlob:
>
> SequenceID | Name    | City      | XMLComment
> -----------+---------+-----------+-----------
> 1          | Amol    | Kolhapur  | <books><Comments><Comment><Title>Title1.1</Title><Description>Description_1.1</Description></Comment><Comment><Title>Title1.2</Title><Description>Description_1.2</Description></Comment><Comment><Title>Title1.3</Title><Description>Description_1.3</Description></Comment></Comments></books>
> 2          | Suresh  | Mumbai    | <books><Comments><Comment><Title>Title2</Title><Description>Description_2</Description></Comment></Comments></books>
> 3          | Vishal  | Delhi     | <books><Comments><Comment><Title>Title3</Title><Description>Description_3</Description></Comment></Comments></books>
> 4          | Swastik | Bangalore | <books><Comments><Comment><Title>Title4</Title><Description>Description_4</Description></Comment></Comments></books>
>
>
>
> Output Data Expected using Spark SQL:
>
> SequenceID | Name    | City      | Title    | Description
> -----------+---------+-----------+----------+----------------
> 1          | Amol    | Kolhapur  | Title1.1 | Description_1.1
> 1          | Amol    | Kolhapur  | Title1.2 | Description_1.2
> 1          | Amol    | Kolhapur  | Title1.3 | Description_1.3
> 2          | Suresh  | Mumbai    | Title2   | Description_2
> 3          | Vishal  | Delhi     | Title3   | Description_3
> 4          | Swastik | Bangalore | Title4   | Description_4
>
>
>
> I can parse a single XML file in spark-shell using the approach from the
> example below, but how do we apply the same parsing to the XML in every row?
>
> https://community.hortonworks.com/questions/71538/parsing-xml-in-spark-rdd.html
>
>
> val dfX = sqlContext.read.format("com.databricks.spark.xml")
>   .option("rowTag", "book").load("books.xml")
>
> // registerTempTable returns Unit, so there is nothing to assign
> dfX.registerTempTable("books")
>
> dfX.printSchema()
>
> val books_inexp = sqlContext.sql(
>   "select title, author from books where price < 10")
>
> books_inexp.show()
>
>
>
> Regards,
>
> Amol
>



-- 
Best Regards,
Ayan Guha
