Hello Experts,

I’m using the spark-xml package, which automatically infers my schema and
creates a DataFrame.

I’m extracting a few fields such as id and name (which are unique) from the
XML below, but I also need to store the entire XML in one of the columns. I’m
writing this data to an Avro Hive table. Can anyone tell me how to achieve this?

An example XML document and the expected output are given below.

Sample XML:
<emplist>
  <emp>
    <manager>
      <id>1</id>
      <name>foo</name>
      <subordinates>
        <clerk>
          <cid>1</cid>
          <cname>foo</cname>
        </clerk>
        <clerk>
          <cid>1</cid>
          <cname>foo</cname>
        </clerk>
      </subordinates>
    </manager>
  </emp>
</emplist>
 
Expected output:
id, name, XML
1, foo, <emplist> ... </emplist>
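
To make the expected row shape concrete, here is a minimal sketch in plain Python using only the standard library. It is not spark-xml and not a solution to the DataFrame question; it only illustrates the (id, name, XML) tuple I am after. The names SAMPLE_XML and to_row are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample document from this post (illustrative data only).
SAMPLE_XML = """<emplist>
  <emp>
    <manager>
      <id>1</id>
      <name>foo</name>
      <subordinates>
        <clerk>
          <cid>1</cid>
          <cname>foo</cname>
        </clerk>
        <clerk>
          <cid>1</cid>
          <cname>foo</cname>
        </clerk>
      </subordinates>
    </manager>
  </emp>
</emplist>"""


def to_row(xml_text):
    """Hypothetical helper: return (id, name, xml) for one document.

    The third element is the untouched input string, so the whole XML
    travels alongside the extracted fields.
    """
    root = ET.fromstring(xml_text)          # root element is <emplist>
    manager = root.find("./emp/manager")    # first manager under emp
    return (manager.findtext("id"), manager.findtext("name"), xml_text)


row = to_row(SAMPLE_XML)
print(row[0], row[1], len(row[2]))
```

What I need is the same result from spark-xml: the two extracted fields plus the original XML string as a third column.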
 
Thanks,
Sreekanth Jella
 
