Thank you for your reply.
Can I do the same with Java scripts?
To be clearer: I have a folder with multiple XML files that I want to
read and parse in order to extract the values of some attributes (att1, att2).

Example:
<elem att1="452" att2="7587">elem1</elem>
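
To make the goal concrete, here is a minimal sketch of the extraction I have in mind, written in Python only because the streaming script in the quoted reply is Python. The names extract_attrs and main are hypothetical, and the sketch assumes that each input line holds one complete, well-formed XML fragment with quoted attribute values.

```python
import sys
import xml.etree.ElementTree as ET


def extract_attrs(xml_text, tag="elem", names=("att1", "att2")):
    """Return one tuple of attribute values per <tag> element in xml_text."""
    root = ET.fromstring(xml_text)
    # root.iter(tag) also yields the root itself when its tag matches.
    # Missing attributes are emitted as empty strings.
    return [tuple(el.get(n, "") for n in names) for el in root.iter(tag)]


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: one XML fragment per input line.
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            rows = extract_attrs(line)
        except ET.ParseError:
            continue  # skip fragments that are not well-formed XML
        for row in rows:
            # Comma-separated output, one line per matching element.
            stdout.write(",".join(row) + "\n")
```

Calling main() at the bottom of the file would turn this into a script that reads from stdin and writes to stdout, the same shape as the stream command in the quoted reply.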

Thanks

On Wed, Sep 14, 2011 at 4:53 PM, <[email protected]> wrote:

> I do this:
>  define analyze_unif `analyze_unif_recs.py`
>    input  (stdin)
>    output (stdout USING PigStreaming(','))
>    ship   ('$scriptDir/analyze_unif_recs.py');
>
>  UnifLines  = load '$unif_xml'
>    using org.apache.pig.piggybank.storage.XMLLoader('REC')
>    as (doc:chararray);
>  UnifXmlByDocId = stream UnifLines through analyze_unif
>          as (docid   : int,
>              xml_comp: chararray
>              );
>
> where analyze_unif_recs.py is a Python script I wrote that does the XML
> parsing, and org.apache.pig.piggybank.storage.XMLLoader('REC') finds the
> <REC> elements in the XML input, which are passed to my script.
>
>
> William F Dowling
> Sr Technical Specialist, Software Engineering
> Thomson Reuters
> 0 +1 215 823 3853
>
>
> -----Original Message-----
> From: Baraa Mohamad [mailto:[email protected]]
> Sent: Wednesday, September 14, 2011 10:41 AM
> To: [email protected]
> Subject: reading xml file within a UDF
>
> Hello,
> I have a question, please.
>
> How can I read a file in a UDF in Pig?
>
> ex:  A = load 'xmlFiles' using myXMLParser ( xmlfile)
>
> Can I do something like that, so that I can parse the XML file using some
> Java library?
>
> Thanks for your help
>
> Baraa
> --
>


