I need to parse an XML file and generate columns based on parameters that the 
user passes to the Pig script.

For example, consider the following XML:
<school>
    <students>
        <student>
            <name>test</name>
            <rno>1</rno>
            <rank>3</rank>
        </student>
        <student>
            <name>xyz</name>
            <rno>3</rno>
            <rank>2</rank>
        </student>
    </students>
</school>

The parser should return only the fields whose names the user specifies.
For example, if the user specifies the field names as 'name|rno', the parser 
should parse the XML and return a tuple containing name and rno.

I am using XMLLoader to split the XML into <student> elements, and I have 
written a Java UDF to parse each student element.
I defined a parameterized constructor in the UDF class, through which I pass 
the columns/attributes to be parsed.
I then overrode the outputSchema(Schema input) method, in which I fetch the 
column names and add a new field schema for each one.
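
To make the intent concrete, here is a minimal sketch of the parsing logic only, written against the standard Java XML API so that it runs without Pig. The class name StudentFieldExtractor and its methods are hypothetical, chosen for illustration; this is not my actual UDF code. The idea is that the UDF constructor would receive the 'name|rno' string, outputSchema would add one field per entry of fieldNames(), and exec would return the values from extract() as a tuple.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Pig): holds the user-selected field names
// and pulls their values out of a single <student> fragment.
final class StudentFieldExtractor {
    private final List<String> fieldNames;

    // fieldSpec is the pipe-delimited string from the script, e.g. "name|rno".
    StudentFieldExtractor(String fieldSpec) {
        // split() takes a regex, so the pipe has to be escaped.
        this.fieldNames = Collections.unmodifiableList(
                Arrays.asList(fieldSpec.trim().split("\\s*\\|\\s*")));
    }

    // Column names in the order requested.
    List<String> fieldNames() {
        return fieldNames;
    }

    // Returns one value per requested field; null when the element is absent.
    List<String> extract(String studentXml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(studentXml)));
            List<String> values = new ArrayList<>();
            for (String field : fieldNames) {
                NodeList nodes = doc.getElementsByTagName(field);
                values.add(nodes.getLength() > 0
                        ? nodes.item(0).getTextContent().trim()
                        : null);
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse student XML", e);
        }
    }

    public static void main(String[] args) {
        StudentFieldExtractor extractor = new StudentFieldExtractor("name|rno");
        String xml = "<student><name>test</name><rno>1</rno><rank>3</rank></student>";
        // prints: [name, rno] -> [test, 1]
        System.out.println(extractor.fieldNames() + " -> " + extractor.extract(xml));
    }
}
```

The field list is split once in the constructor, so the column names and the extracted values are always in the same order.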

However, this does not work as expected. Is there any way of getting this 
done?



