Use org.apache.pig.piggybank.storage.XMLLoader to load the records, and then extract the fields with a regex (REGEX_EXTRACT_ALL).
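
A rough sketch of that suggestion for the src1.txt data quoted below, assuming "regex_all" refers to Pig's built-in REGEX_EXTRACT_ALL. The jar path, the relation names, and the output directory are placeholders, not anything from the thread. Note that this still needs one tag name and one regex per XML layout, so it does not by itself give a single generic script:

```pig
-- Placeholder path: point this at wherever piggybank.jar lives on your system.
REGISTER /path/to/piggybank.jar;

-- XMLLoader('foo') emits one chararray per <foo>...</foo> block, tags included.
records = LOAD 'src1.txt'
          USING org.apache.pig.piggybank.storage.XMLLoader('foo')
          AS (xml:chararray);

-- REGEX_EXTRACT_ALL returns a tuple of the capture groups; FLATTEN turns it into columns.
-- (?s) lets '.' match newlines, since each record spans several lines.
fields = FOREACH records GENERATE
         FLATTEN(REGEX_EXTRACT_ALL(xml, '(?s).*<bar>(.*?)</bar>.*')) AS (bar:chararray);

-- Write the extracted values out as comma-separated text.
STORE fields INTO 'src1_csv' USING PigStorage(',');
```

For src2.txt the same shape should work with XMLLoader('aux') and a regex with two capture groups, one for <foobar> and one for <fushbar>.
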
On Thu, Sep 12, 2013 at 11:18 AM, jamal sasha <[email protected]> wrote:

> Umm.. yes.. but how do I generalize it?
> What I am looking for is something like the JSON parsers we have in, say, Java:
> if I give it a valid JSON string, I can parse it and then access it
> as a hashmap.
> But with the XML loader, I still have to specify regex rules??
>
> Actually, is it possible to just flatten the XML?
> So for example, convert
> <aux>
> <foobar>1</foobar>
> <fushbar>foo</fushbar>
> </aux>
> to
> <aux><foobar>1</foobar><fushbar>foo</fushbar></aux>
> ???
>
> On Wed, Sep 11, 2013 at 10:32 PM, Jagat Singh <[email protected]> wrote:
>
> > Use piggybank xmlloader
> > On 12/09/2013 10:14 AM, "jamal sasha" <[email protected]> wrote:
> >
> > > Hi,
> > > So I have different XML data sources... For example:
> > >
> > > src1.txt
> > >
> > > <foo>
> > > <bar>1</bar>
> > > </foo>
> > > <foo>
> > > <bar>2</bar>
> > > </foo>
> > > .. and so on
> > >
> > > and another data source
> > >
> > > src2.txt
> > >
> > > <aux>
> > > <foobar>1</foobar>
> > > <fushbar>foo</fushbar>
> > > </aux>
> > >
> > > ... and so on
> > >
> > > So basically different XML (valid formats).
> > >
> > > Rather than writing different Pig scripts, is there a way to write one
> > > script and then convert all this XML data into CSV?
> > > Thanks
> > >

--
Thanks & Regards,
S. Ajay Kumar
+91-9966159106
