Chad Armstrong wrote:
> Thanks Thomas,
> No no, I'm not wed to Template::Extract at all, but the reason I was
> drawn to it is because I am going to be doing a lot of scraping for a
> project and wanted to be able to externalize the template for the
> various target pages, rather than embedding it for a particular page
> format. Do you know of any other modules that might be able to
> accomplish this? My goal is basically to extract certain data and
> create an rss feed given a URL.

Actually, I do that kind of thing a lot. I use XML::LibXML to parse the pages behind the URLs, and Template Toolkit to format the extracted data into RSS, HTML snippets for a portal, etc.

I find XPath expressions very nice for this because they are simple strings and therefore externalizable (put them in a config file, a database, etc.), which makes maintenance simpler. That matters when a site's layout changes, which it inevitably will from time to time. Also, I often find that a single well-crafted XPath expression is all that is needed to extract exactly what you want. If there's something else as convenient as XPath for parsing HTML, I haven't found it yet.

- Mark.
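
For illustration, here is a minimal sketch of the approach described above: XPath strings kept as plain data, XML::LibXML doing the extraction, and Template Toolkit producing the RSS. The sample markup, the XPath expressions, the feed metadata, and the names %site, @items and $rss_template are all made up for the example. A real script would fetch the page from its URL and load the expressions from a config file or database.

```perl
use strict;
use warnings;
use XML::LibXML;
use Template;

# Hypothetical per-site settings. Because they are plain strings, they could
# be loaded from a config file or database instead of being written here.
my %site = (
    items => '//div[@class="story"]',   # one node per feed entry
    title => './/h2/a',                 # evaluated relative to each item node
    link  => './/h2/a/@href',
);

# Stand-in for a fetched page (invented markup, for illustration only).
my $html = <<'HTML';
<html><body>
  <div class="story"><h2><a href="http://example.com/1">First story</a></h2></div>
  <div class="story"><h2><a href="http://example.com/2">Second story</a></h2></div>
</body></html>
HTML

# recover => 2 asks libxml2 to tolerate sloppy real-world HTML without warnings.
my $doc = XML::LibXML->load_html(string => $html, recover => 2);

# Build one hash per matched item, using the externalized expressions.
my @items = map {
    +{
        title => $_->findvalue($site{title}),
        link  => $_->findvalue($site{link}),
    }
} $doc->findnodes($site{items});

# A bare-bones RSS 2.0 template; the channel metadata is placeholder text.
my $rss_template = <<'TT';
<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Scraped feed</title>
<link>http://example.com/</link>
<description>Example feed built from scraped data</description>
[% FOREACH item IN items -%]
<item><title>[% item.title | html %]</title><link>[% item.link | html %]</link></item>
[% END -%]
</channel></rss>
TT

my $tt  = Template->new;
my $rss = '';
$tt->process(\$rss_template, { items => \@items }, \$rss)
    or die $tt->error;
print $rss;
```

With this layout, a change to the target site's markup should only require editing the strings in %site. The extraction and templating code stays the same.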
