hi,
I have some XML files with a structure like this:

<document>

   <header>some text</header>
   <record>record 1</record>
   <record>record 2</record>
   ....
   <record>record N</record>
</document>

The info in the header is needed to process the records. Using Mahout's
XmlInputFormat I'm able to extract every <record>, but not the info in the
header. One option is not to split the file and to process it as a whole in
the mapper, but the files are sometimes over 200MB and I believe that would
not be very efficient. So if anyone has a suggestion about how to process
this kind of file (the same use case can apply to any kind of file), I would
appreciate it!
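
To make the idea concrete, here is a rough sketch of what I would like to end up with: read only the beginning of the file to get the header, then hand that header to whatever processes each record. This is plain Java I/O standing in for the HDFS stream, not Hadoop or Mahout code, and the names (HeaderSketch, readHeader, attachHeader) are made up for illustration. It also assumes the header sits near the top of the file, as in the structure above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop or Mahout.
class HeaderSketch {

    // Read lines from the start of the document and stop as soon as the
    // closing </header> tag has been seen, so the rest of the (possibly
    // large) file is never read just to obtain the header.
    static String readHeader(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder prefix = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            prefix.append(line).append('\n');
            int end = prefix.indexOf("</header>");
            if (end >= 0) {
                int start = prefix.indexOf("<header>");
                if (start < 0 || start > end) {
                    return null; // malformed prefix: close tag without open tag
                }
                return prefix.substring(start + "<header>".length(), end);
            }
        }
        return null; // no header found
    }

    // Pair every record with the header so each one can be processed alone.
    static List<String> attachHeader(String header, List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(header + "\t" + record);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<document>\n"
                + "   <header>some text</header>\n"
                + "   <record>record 1</record>\n"
                + "   <record>record 2</record>\n"
                + "</document>\n";
        String header = readHeader(new StringReader(xml));
        // In the real job the records would come from the input format.
        List<String> records = Arrays.asList("record 1", "record 2");
        System.out.println(attachHeader(header, records));
    }
}
```

What I don't know is where the equivalent of readHeader should live in a MapReduce job so that every split of the same file gets the header without reading the whole file.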

PS: sorry for the duplicate email but I believe this is the correct list to
ask the question, not common-user.

thanks
-- 
Alejandro Montenegro del Pino.
Viña del Mar - Chile
phone: (+56) 9-68358690