Re: Solvent Tutorial for Scraping Multiple pages

David Morris Mon, 19 Feb 2007 20:01:01 -0800

I'm finally getting to try this screen scraping thing out, and I'm not sure
how to use the pattern matching function to scrape multiple pages.


I see this code on one of the sample scrapers:

match: "^http://www\.vacancyguide\.com/rentals/search\.cfm\?.*$"

But I don't see that anywhere inside the actual scraper code. The site I'm
trying to scrape has a pretty simple URL pattern, but I don't know where to
put it in the code:

http://www.newfarm.com/farmlocator/farm_detail.php?ID=#

where # is a number from 1 to 1200 or so. Can anyone point me in the right
direction here? Thanks,

Dave
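
For reference, a minimal sketch of a pattern for the URL above, assuming the
match value is an ordinary regular expression tested against the page
address, as the vacancyguide sample suggests. The names detailPage and
isDetailPage are made up for illustration and are not part of Solvent or
Piggy Bank; where the pattern is registered is not shown here.

```javascript
// Hypothetical pattern for the farm detail pages: the fixed address
// followed by one or more digits after "ID=". Dots and the question
// mark are escaped so they match literally.
var detailPage = /^http:\/\/www\.newfarm\.com\/farmlocator\/farm_detail\.php\?ID=\d+$/;

// Illustrative helper: true when a URL looks like a farm detail page.
function isDetailPage(url) {
  return detailPage.test(url);
}
```

The same expression written as a string, in the style of the sample, would
be "^http://www\.newfarm\.com/farmlocator/farm_detail\.php\?ID=\d+$".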

On 2/6/07, Ben Hyde <[EMAIL PROTECTED]> wrote:


On Feb 6, 2007, at 5:40 PM, Keith Alexander wrote:
> David Morris wrote:
>> Is there a tutorial yet on how to scrape a site with multiple pages?
>> I would like to scrape a site like allrecipes.com or epicurious.com,
>> and I can't figure out what to do to scrape multiple pages.
> In the script templates under "Insert" on the left hand panel, there
> is a menu item called "Code to Scrape several pages".
>
> The main point is this function:
>     piggybank.scrapeURL(url, scrapePage, failure);

and assorted examples here:
    http://simile.mit.edu/wiki/Category:Javascript_screen_scraper
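
A rough sketch of how that call might be applied to a numbered URL pattern
such as farm_detail.php?ID=1 through ID=1200. Only the call shape
piggybank.scrapeURL(url, scrapePage, failure) comes from this thread; the
helpers buildDetailUrls and scrapeAll are hypothetical names, and the
scrapeURL function is passed in as a parameter so the sketch does not
assume anything else about the piggybank object.

```javascript
// Build the list of detail-page URLs for IDs first..last (inclusive).
// The base address is the one quoted earlier in this thread.
function buildDetailUrls(first, last) {
  var base = "http://www.newfarm.com/farmlocator/farm_detail.php?ID=";
  var urls = [];
  for (var id = first; id <= last; id++) {
    urls.push(base + id);
  }
  return urls;
}

// Hand every URL to a scrapeURL-style function, together with the
// per-page callback and the failure callback.
function scrapeAll(urls, scrapeURL, scrapePage, failure) {
  for (var i = 0; i < urls.length; i++) {
    scrapeURL(urls[i], scrapePage, failure);
  }
}

// Inside a scraper, the call would presumably look something like
// this (assumption: scrapePage and failure are defined by the
// "Code to Scrape several pages" template):
//
//   scrapeAll(buildDetailUrls(1, 1200),
//             function (url, ok, bad) { piggybank.scrapeURL(url, ok, bad); },
//             scrapePage, failure);
```

Whether every ID in the range exists on the site is unknown, so the
failure callback should expect missing pages.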
_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general

