I'm finally getting around to trying out screen scraping, and I'm not sure how to use the pattern matching function to scrape multiple pages.
I see this code on one of the sample scrapers:

match: "^http://www\.vacancyguide\.com/rentals/search\.cfm\?.*$"

But I don't see that anywhere inside the actual scraper code. The site I'm trying to scrape has a pretty simple url pattern, but I don't know where to put it into the code:

http://www.newfarm.com/farmlocator/farm_detail.php?ID=#

where # is a number from 1 to 1200 or so. Can anyone point me in the right direction here?

Thanks,
Dave

On 2/6/07, Ben Hyde <[EMAIL PROTECTED]> wrote:
On Feb 6, 2007, at 5:40 PM, Keith Alexander wrote:
> David Morris wrote:
>> Is there a tutorial yet on how to scrape a site with multiple pages
>> yet? I would like to scrape a site like allrecipes.com or
>> epicurious.com, and I can't figure out what to do to scrape
>> multiple pages.
> In the script templates under "Insert" on the left hand panel, there is
> a menu item called Code to Scrape several pages.
>
> The main point is this function: piggybank.scrapeURL(url,
> scrapePage, failure);

and assorted examples here:
http://simile.mit.edu/wiki/Category:Javascript_screen_scraper

_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
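
For what it's worth, here is a minimal sketch of how the numbered detail pages from the question could be fed to the piggybank.scrapeURL(url, scrapePage, failure) call quoted above. Only that call and its argument order come from the thread. The helper names (buildDetailURLs, scrapeAll), the DETAIL_PAGE_PATTERN regex, and the empty scrapePage/failure callbacks are hypothetical placeholders, and whether Piggy Bank is happy to have all the requests queued in one loop is an assumption, not something confirmed here.

```javascript
// Base URL and ID range taken from the question ("1 to 1200 or so").
var BASE_URL = "http://www.newfarm.com/farmlocator/farm_detail.php?ID=";
var FIRST_ID = 1;
var LAST_ID = 1200; // approximate upper bound; adjust to the real one

// Hypothetical pattern for the detail pages, modeled on the "match:"
// line from the sample scraper (dots and the question mark are escaped).
var DETAIL_PAGE_PATTERN =
  /^http:\/\/www\.newfarm\.com\/farmlocator\/farm_detail\.php\?ID=\d+$/;

// Build the list of detail-page URLs for IDs firstId..lastId inclusive.
function buildDetailURLs(firstId, lastId) {
  var urls = [];
  for (var id = firstId; id <= lastId; id++) {
    urls.push(BASE_URL + id);
  }
  return urls;
}

// Hypothetical callbacks: the thread only names them, so the bodies are
// left as placeholders for the real extraction and error handling.
function scrapePage() {
  // pull the data out of the loaded page here
}

function failure() {
  // log or skip the page that could not be loaded
}

// Hand every URL to piggybank.scrapeURL, using the argument order quoted
// in the thread. Outside Piggy Bank (where `piggybank` is not defined)
// this does nothing and returns 0.
function scrapeAll(urls) {
  if (typeof piggybank === "undefined") {
    return 0;
  }
  for (var i = 0; i < urls.length; i++) {
    piggybank.scrapeURL(urls[i], scrapePage, failure);
  }
  return urls.length;
}
```

Inside a scraper this would be kicked off with scrapeAll(buildDetailURLs(FIRST_ID, LAST_ID)). Trying a small range first, say IDs 1 to 5, seems wise before requesting all 1200 pages.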
