On Fri, 27 Aug 2004 13:23:15 -0700 (PDT), J. Goodleaf <[EMAIL PROTECTED]> wrote:
> Any ideas appreciated.
>
> I need to parse some really large (> 25MB) sgml files. The files are
> just database dumps essentially, with each record looking something
> like this:
> <reportid>
> <primarystuff>STUFF</primarystuff>
> <secondarystuff>
> stuff
> </secondarystuff>
> </primarystuff>
> </reportid>
>
> The records are not overly complicated, but I've never tried XML or
> SGML parsing before and am at a loss on how to approach it. The
> files are way too big just to slurp in and play around with. Can I
> set the record separator to '</reportid>'? Or is that a stupid way to
> approach this? Suggestions/modules would be great.
>
> Thanks,
> J
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
>
>

J,
Have a look at http://perl-xml.sourceforge.net/faq/ for the information you need about parsing XML with Perl. It introduces each of the major Perl XML modules. I recommend XML::TreeBuilder: I was an XML newbie last week, and after trying all of them, I found it the easiest to use.

HTH,
Kevin

--
Kevin Old
[EMAIL PROTECTED]
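
On the record separator question: setting `$/` to '</reportid>' is a workable way to avoid slurping the whole file, since each read then returns one record. Below is a minimal sketch using only core Perl. It assumes records never contain a nested '</reportid>', and it pulls fields out with a regex rather than a real SGML parser. The sample data and the name `read_primary` are made up for illustration, and the in-memory filehandle stands in for the real dump file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real dump file.
my $sample = <<'END';
<reportid>
<primarystuff>STUFF</primarystuff>
<secondarystuff>
stuff
</secondarystuff>
</reportid>
<reportid>
<primarystuff>MORE STUFF</primarystuff>
<secondarystuff>
more stuff
</secondarystuff>
</reportid>
END

# Read one record at a time from a filehandle and return the
# <primarystuff> value of each record (hypothetical helper).
sub read_primary {
    my ($fh) = @_;
    local $/ = '</reportid>';    # each read now ends at a closing tag
    my @found;
    while (my $record = <$fh>) {
        # Skip the leftover whitespace after the last record.
        next unless $record =~ /<reportid>/;
        if ($record =~ m{<primarystuff>(.*?)</primarystuff>}s) {
            push @found, $1;
        }
    }
    return @found;
}

# In-memory filehandle for the demo only.
open my $fh, '<', \$sample or die "Cannot open sample: $!";
my @primary = read_primary($fh);
close $fh;

print "$_\n" for @primary;
```

For a real dump, open the file path instead of `\$sample`. Only one record is held in memory at a time, so the file size matters much less. Each `$record` could also be handed to one of the XML modules instead of a regex if the records turn out to be well-formed.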