On Mon, Nov 10, 2008 at 6:21 PM, Curtis Leach <[EMAIL PROTECTED]> wrote:
> I'm using a Perl 5.8.8 build on a Windows Server platform and I need to
> work with XML files.
>
> Is there a preferred module for use when working with XML in Perl? I'm
> new to XML & I'm just looking for directions to XML modules to look at
> so I don't waste too much time going down the wrong path while wading
> through all the available XML modules on CPAN.
>
> I have a few example programs using XML::Simple (v2.14) for parsing out a
> couple of config values from some small XML files and returning them to
> the caller, but I'm going to need to programmatically edit more complex
> XML files that are considerably larger & save the results. With the
> potential of the files being edited being huge. But maybe 10,000 to
> 15,000 data points to update being average.
>
> I'm going to see multiple entries such as:
>
> <tag22 rt="abcd" seg="22" opt="xyz" />
> ....
> <tag22 rt="masd" seg="14" opt="zxy" />
>
> So I'll load the XML file into memory, validate it's well formed XML,
> and then locate every occurrence of tag22 to update its optional "seg"
> attribute. Making sure I only update a specific tag22 one time! Once
> I've updated them all, I'll write the updated XML document back to disk.
>
> So does this sound like something XML::Simple can handle? Or should I
> be looking at another XML module? Are there tips to follow for making
> sure the generated file isn't too hard to look at by hand? (The by hand
> part isn't a requirement, but past experience suggests taking an eyeball
> to something with your favorite editor can help a lot during
> troubleshooting.)

If the output XML is important, XML::Simple really isn't the best
choice. My experience has been that it is a pain to define your XML
output structure with XML::Simple. It was designed to be a very simple
XML -> Perl mapper; for anything more advanced than that you really
should (in my opinion) use a bigger API.

As Wayne says elsewhere, there are two types of parser: DOM parsers,
which build a version of your document in memory as an object tree, and
SAX parsers, which treat your XML as a "stream" that triggers callback
events. The choice between them depends on the size of your documents
(will they fit within main memory? I've heard of people having to parse
>10G documents, where you really *can't* use a DOM parser) and on your
preferred method of work.

For DOM parsing I haven't really used much beyond XML::LibXML, which is
a wrapper around the C libxml2 library. There are packages that provide
better sugar than the C API, but if you know one DOM you know them all.

For SAX parsing I've used the XML::SAX module, which is well implemented
and "stable" (in that, as far as I know, the developers haven't found
bugs or required major changes to it in the last 6 years).

You should really look into the Perl + XML columns on XML.com
( http://www.xml.com/pub/at/15 ). They are old now, but from what I can
tell they are still useful: I have just built an XML toolkit for work
based on some of their ideas and code. Beyond that, you might find the
Perl-XML FAQ ( http://perl-xml.sourceforge.net/faq/ ) useful too.

-Chris
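
For what it's worth, here is a minimal sketch of the DOM route for the
tag22 task described in the question, assuming XML::LibXML is installed
from CPAN. The update_segs helper, the <config> wrapper element and the
rt-to-seg lookup table are hypothetical, made up for illustration; only
the tag22/rt/seg/opt names come from the original post. parse_string
dies on input that is not well formed, and findnodes returns each
matching element once, so no element is updated twice.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: set "seg" on every <tag22> whose "rt" value
# appears in the lookup table. Returns the document and the number
# of elements that were changed.
sub update_segs {
    my ($xml_text, $new_seg_for) = @_;

    # Dies with a parse error if the input is not well-formed XML.
    my $doc = XML::LibXML->new->parse_string($xml_text);

    my $count = 0;
    # Each matching element appears exactly once in the result list.
    for my $node ($doc->findnodes('//tag22')) {
        my $rt = $node->getAttribute('rt');
        next unless defined $rt && exists $new_seg_for->{$rt};
        # Creates "seg" if it is absent, overwrites it otherwise.
        $node->setAttribute('seg', $new_seg_for->{$rt});
        $count++;
    }
    return ($doc, $count);
}

# Made-up sample input in the shape shown in the question.
my $sample = <<'XML';
<config>
  <tag22 rt="abcd" seg="22" opt="xyz" />
  <tag22 rt="masd" seg="14" opt="zxy" />
  <tag22 rt="none" opt="qqq" />
</config>
XML

my ($doc, $count) = update_segs($sample, { abcd => '30', masd => '31' });
print "updated $count elements\n";
print $doc->toString;
```

For a real file you would probably use parse_file($path) in place of
parse_string, and $doc->toFile($path) to write the result back to disk.
Untouched whitespace in the document is kept as it was, which helps if
you want to eyeball the output in an editor afterwards.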