Since XMLLoader does not seem to satisfy your requirements, and assuming
each line contains an XML document (which XMLLoader requires anyway,
iirc), what you can do is write a simple UDF to handle this.
Use a line reader as the load function, and write a UDF which parses the
input line as a Document and extracts what you need.
A = load 'file' using TextLoader() as (input_line:chararray);
B = foreach A generate flatten( my_udf(input_line) ) as (...);
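
As a rough sketch of the parsing step such a UDF would perform, the helper below uses only the JDK's DOM parser. The class name AttrLineParser and the choice of returned fields are illustrative assumptions, not part of Pig or piggybank. It assumes each input line is one complete, well-formed <attr> element, as in the sample data.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: the name and the returned field order are assumptions
// made for illustration only.
class AttrLineParser {
    // Parses one line holding a single <attr ...>value</attr> element and
    // returns [tag, vr, len, text content] as strings.
    static List<String> parse(String line) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The root element of the parsed line is the <attr> element itself.
        Element attr = builder
            .parse(new InputSource(new StringReader(line)))
            .getDocumentElement();
        return Arrays.asList(
            attr.getAttribute("tag"),
            attr.getAttribute("vr"),
            attr.getAttribute("len"),
            attr.getTextContent());
    }
}
```

Calling AttrLineParser.parse on the first sample line should give the four values 00020000, UL, 4 and 180. To use this from Pig, the same logic would presumably go inside the exec method of a class extending org.apache.pig.EvalFunc, returning the values as a Tuple so that flatten(my_udf(input_line)) yields one column per value. A malformed line makes the parser throw, so a real UDF would likely want to catch that and return null for the row.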
Regards,
Mridul
On Tuesday 22 February 2011 08:29 PM, Baraa Mohamad wrote:
Hi all,
If I have the following XML file:
<attr tag="00020000" vr="UL" len="4">180</attr>
<attr tag="00020001" vr="OB" len="2">00\01</attr>
How can I read it using XMLLoader? I mean, how can I read, for example, the
values of tag and vr, which are inside the attr element?
I already wrote the following:
A = load 'dicoms/' using org.apache.pig.piggybank.storage.XMLLoader('attr')
as (x:chararray);
But that treats the whole line as a single chararray, so how can I read the
values of tag, vr and attr?
Best regards