On Mon, Nov 10, 2008 at 6:21 PM, Curtis Leach <[EMAIL PROTECTED]> wrote:
> I'm using a Perl 5.8.8 build on a Windows Server platform and I need to
> work with XML files.
>
> Is there a preferred module for use when working with XML in Perl?  I'm
> new to XML & I'm just looking for directions to XML modules to look at
> so I don't waste too much time going down the wrong path while wading
> through all the available XML modules on CPAN.
>
> I have a few example programs using XML::Simple (v2.14)for parsing out a
> couple of config values from some small XML files and returning them to
> the caller, but I'm going to need to programmatically edit more complex
> XML files that are considerably larger & save the results.  With the
> potential of the files being edited being huge.  But maybe 10,000 to
> 15,000 data points to update being average.
>
> I'm going to see multiple entries such as:
>
> <tag22 rt="abcd" seg="22" opt="xyz" />
> ....
> <tag22 rt="masd" seg="14" opt="zxy" />
>
> So I'll load the XML file into memory, validate it's well formed XML,
> and then locate every occurrence of tag22 to update it's optional "seg"
> attribute.  Making sure I only update a specific tag22 one time!  Once
> I've updated them all, I'll write the updated XML document back to disk.
>
> So does this sound like something XML::Simple can handle?  Or should I
> be looking at another XML module?  Are there tips to follow for making
> sure the generated file isn't to hard to look at by hand?  (The by hand
> part isn't a requirement, but past experience suggests taking an eyeball
> to something with your favorite editor can help a lot during
> troubleshooting.)

If the output XML is important, XML::Simple really isn't the best
choice. My experience has been that it is a pain to define your XML
output structure with XML::Simple. It was designed to be a very simple
XML -> Perl mapper; for anything more advanced than that (in my
opinion) you really should use a bigger API. Like Wayne says elsewhere,
there are two types: DOM parsers, which build a version of your
document in memory as an object tree, and SAX parsers, which treat your
XML as a "stream" that triggers callback events. The choice between
them is based on the size of your documents (will they fit within main
memory? I've heard of people having to parse >10G documents, where you
really *can't* use a DOM parser) and your preferred method of work.
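
To make the DOM approach concrete for the task in the question, here is
a minimal sketch using XML::LibXML. The root element name, the sample
data, and the %new_seg lookup table are assumptions made up for
illustration; only the tag22 element and its rt/seg/opt attributes come
from the original message.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document; the <config> root is an assumption.
my $xml = <<'XML';
<config>
  <tag22 rt="abcd" seg="22" opt="xyz" />
  <tag22 rt="masd" seg="14" opt="zxy" />
  <other rt="abcd" />
</config>
XML

# Hypothetical lookup: which "seg" value each "rt" should receive.
my %new_seg = ( abcd => '30', masd => '31' );

# parse_string dies if the input is not well-formed XML, which covers
# the "validate it's well-formed" step. For a file on disk,
# parse_file($path) works the same way.
my $parser = XML::LibXML->new;
my $doc    = $parser->parse_string($xml);

# findnodes returns each matching node exactly once, so every tag22
# is visited, and therefore updated, a single time.
my $updated = 0;
for my $node ( $doc->findnodes('//tag22') ) {
    my $rt = $node->getAttribute('rt');
    next unless defined $rt && exists $new_seg{$rt};
    $node->setAttribute( seg => $new_seg{$rt} );
    $updated++;
}

# Serialize the modified tree; $doc->toFile($path) writes to disk instead.
my $out = $doc->toString;
print $out;
```

Because the whole tree lives in memory, this only suits documents that
fit comfortably in RAM; 10,000 to 15,000 elements should normally be
well within that limit.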

For DOM parsing I haven't really used much beyond XML::LibXML, which is
a wrapper around the C libxml2 library. There are packages that provide
better sugar than the C API, but if you know one DOM you know them all.
For SAX parsing I've used the XML::SAX module, which is well
implemented and "stable" (in that I know the developers haven't found
bugs or required major changes to it in the last 6 years).
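
For comparison, a rough sketch of the SAX style with XML::SAX follows.
The handler package name (Tag22Collector) and the sample data are
hypothetical. The handler only collects the "seg" values it sees;
rewriting the document in a stream would need an extra output stage,
which is left out here.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

{
    # Hypothetical handler: records the "seg" value of every tag22.
    package Tag22Collector;
    use base 'XML::SAX::Base';

    sub start_element {
        my ( $self, $el ) = @_;
        if ( $el->{Name} eq 'tag22' ) {
            # Attributes is a hash of attribute records; match on the
            # record's Name rather than relying on the hash key format.
            my ($seg) = grep { $_->{Name} eq 'seg' }
                        values %{ $el->{Attributes} };
            push @{ $self->{segs} }, $seg->{Value} if $seg;
        }
        # Pass the event along in case a downstream handler is attached.
        $self->SUPER::start_element($el);
    }
}

# Hypothetical sample document; the <config> root is an assumption.
my $xml = <<'XML';
<config>
  <tag22 rt="abcd" seg="22" opt="xyz" />
  <tag22 rt="masd" seg="14" opt="zxy" />
</config>
XML

my $handler = Tag22Collector->new;
my $parser  = XML::SAX::ParserFactory->parser( Handler => $handler );

# The parser fires start_element as it reads; no tree is kept in memory.
$parser->parse_string($xml);

my @segs = @{ $handler->{segs} || [] };
print "seg values seen: @segs\n";
```

The callback sees one element at a time, so memory use stays roughly
constant no matter how large the input is. The price is that any state
you need across elements has to be tracked by hand in the handler.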

You should really look into the Perl + XML columns on XML.com (
http://www.xml.com/pub/at/15 ), which are old now but still useful
from what I can tell, having just built an XML toolkit for work
based on some of their ideas and code. Beyond that you might find the
Perl-XML FAQ ( http://perl-xml.sourceforge.net/faq/ ) useful too.

-Chris