On Tue, Dec 9, 2008 at 8:09 PM, Chris Cosner <[EMAIL PROTECTED]> wrote:
> Question: What is the speediest tool to pull data from an xml feed that will
> only be a few hundred lines at most? Some regexes will be necessary.
>
> Context:
> I am playing with the google books data api. They provide a feed, which you
> can see an example of here:
> http://code.google.com/apis/books/docs/gdata/developers_guide_protocol.html
> (scroll about halfway down)
>
> I can send search terms to the api and get back some information about the
> first three results in Google Book Search to integrate with our own search
> results. [Done] So in some cases the user may click through to GBS, and in
> others stay on our site. The GBS feed duplicates some tags, such as
> "dc:identifier" and the only way to distinguish them will be with a regex on
> the contents, or by noting tag order.
>
> With the CPAN module XML::XSLT I am able to transform this pretty rapidly. I
> tried using XML::Twig, but it seemed too slow for this purpose.
>
> However, XML::XSLT does not support regexes.
>
> So I expect that I'll just have to transform the text as far as possible
> with XML::XSLT and then use Perl directly to finish the job.

I would avoid regexes when working with XML.  They break much too
easily.  Stick to XML tools.  There are any number of XML parsers
available at CPAN.
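
As a rough illustration, here is a minimal sketch that assumes XML::LibXML
(one CPAN parser among several; I have not benchmarked it against XML::XSLT
or XML::Twig for your feed).  The sample XML, the namespace URIs, and the
"ISBN:" prefix on one of the dc:identifier values are assumptions made for
the example, so check them against what the feed really returns.  The point
is that an XPath predicate such as starts-with() can tell the duplicated
tags apart without a regex on the markup or a reliance on tag order.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample shaped like a feed entry; the real feed may differ.
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/terms">
  <entry>
    <title>Example Book</title>
    <dc:identifier>abc123XYZ</dc:identifier>
    <dc:identifier>ISBN:0123456789</dc:identifier>
  </entry>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# XPath needs explicit prefixes for namespaced elements, including
# those in the default (Atom) namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(atom => 'http://www.w3.org/2005/Atom');
$xpc->registerNs(dc   => 'http://purl.org/dc/terms');

my @books;
for my $entry ($xpc->findnodes('//atom:entry')) {
    my $title = $xpc->findvalue('atom:title', $entry);

    # Pick each identifier by its content, not by its position.
    my $isbn = $xpc->findvalue(
        'dc:identifier[starts-with(., "ISBN:")][1]', $entry);
    my $id = $xpc->findvalue(
        'dc:identifier[not(starts-with(., "ISBN:"))][1]', $entry);

    # Drop the assumed "ISBN:" prefix from the extracted text.
    $isbn = substr($isbn, length 'ISBN:') if length $isbn;

    push @books, { title => $title, id => $id, isbn => $isbn };
}

printf "%s | id=%s | isbn=%s\n", @{$_}{qw(title id isbn)} for @books;
```

If you still need a Perl regex afterwards, apply it to the extracted text
values rather than to the raw XML.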

Sean
