Chad Armstrong wrote:
> Thanks Thomas,
>   No no, I'm not wed to Template::Extract at all, but the reason I was
> drawn to it is because I am going to be doing a lot of scraping for a
> project and wanted to be able to externalize the template for the
> various target pages, rather than embedding it for a particular page
> format. Do you know of any other modules that might be able to
> accomplish this? My goal is basically to extract certain data and
> create an rss feed given a URL.

Actually, I do that kind of thing a lot. I use XML::LibXML to parse the
pages fetched from the URLs, and Template Toolkit to format the extracted
data into RSS, HTML snippets for a portal, etc.

Anyway, I find XPath expressions to be very nice because they are simple
strings and therefore externalizable (place into a config file, database,
etc) so that maintenance is simpler. This is useful when the site's layout
changes, which it will inevitably do from time to time. Also, I often find
that just one well-crafted XPath expression is all that is needed to extract
exactly what you want.
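
To make the idea concrete, here is a rough sketch of the approach. It is not
my actual code: it swaps in Python's standard library (xml.etree.ElementTree
and string.Template) for XML::LibXML and Template Toolkit so that it runs
without extra modules. The config keys, the sample page, and the function
names are all made up for illustration. Note that ElementTree only handles
well-formed markup and a limited subset of XPath (it cannot select an
attribute directly, hence the separate "link_attr" key), so real-world HTML
would need a more forgiving parser.

```python
import json
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Hypothetical per-site config: the XPath strings live outside the code,
# so a layout change only means editing this data (file, database, etc.).
CONFIG_JSON = """
{
  "item": ".//div[@class='story']",
  "title": "./h2/a",
  "link": "./h2/a",
  "link_attr": "href"
}
"""

# Made-up, well-formed sample page standing in for a fetched URL.
PAGE = """<html><body>
  <div class="nav"><a href="/">Home</a></div>
  <div class="story"><h2><a href="http://example.com/a">First &amp; foremost</a></h2></div>
  <div class="story"><h2><a href="http://example.com/b">Second story</a></h2></div>
</body></html>"""

# Stand-in for a Template Toolkit template; abbreviated, not a complete feed.
ITEM_TEMPLATE = Template("<item><title>$title</title><link>$link</link></item>")


def extract_items(page, config):
    """Apply the configured XPath expressions and return a list of dicts."""
    root = ET.fromstring(page)
    items = []
    for node in root.findall(config["item"]):
        title_node = node.find(config["title"])
        link_node = node.find(config["link"])
        if title_node is None or link_node is None:
            continue  # skip blocks that do not match the expected layout
        items.append({
            "title": "".join(title_node.itertext()).strip(),
            "link": link_node.get(config["link_attr"], ""),
        })
    return items


def render_rss(items):
    """Format the extracted items as an abbreviated RSS document."""
    body = "\n".join(
        ITEM_TEMPLATE.substitute(title=escape(i["title"]), link=escape(i["link"]))
        for i in items
    )
    return '<rss version="2.0"><channel>\n' + body + "\n</channel></rss>"


config = json.loads(CONFIG_JSON)
print(render_rss(extract_items(PAGE, config)))
```

When the target site changes its layout, only the strings in the config need
to change; the extraction and formatting code stays the same.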

If there's something else as convenient as XPath for parsing HTML, I haven't
found it yet.

- Mark.


_______________________________________________
templates mailing list
[email protected]
http://lists.template-toolkit.org/mailman/listinfo/templates
