Re: Crowbar and multiple page scraping

Stefano Mazzocchi Mon, 01 Oct 2007 17:58:08 -0700

Kimble Young wrote:
> Hi,
> 
> I've been looking at Solvent and Crowbar for doing some of my own mashups,
> but using my own APIs, database, etc. Crowbar is very promising and I had
> some luck with it initially, but it seems that I've hit a wall with
> multiple page scraping.
> 
> Multi-page scraping would be invaluable in making Crowbar and Solvent a
> very powerful solution for people who want to make their own mashups
> outside of the environment provided by Piggy Bank.
> 
> Do we know what's involved in making multi-page scraping happen? Is it a
> complex solution?


Crowbar can do anything that Piggy Bank + Solvent can do and multi-page
scraping has been in Piggy Bank for quite some time.

Many of our screen scrapers available at

 http://simile.mit.edu/wiki/Category:Javascript_screen_scraper

work on multiple pages.

For example, see the ACM scraper:

http://simile.mit.edu/wiki/ACM_Portal_Scraper
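
To make the general idea concrete, here is a minimal sketch of multi-page
scraping: scrape one page, find its "next" link, and repeat until there is
none. This is NOT the Crowbar, Solvent, or Piggy Bank API, and it is written
in Python rather than the JavaScript those scrapers use. The names
PageParser, scrape_all_pages and fetch, and the assumption that results are
marked up as <li class="item"> with an <a rel="next"> link, are all
hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects item text and the 'next page' link from one HTML page.

    Assumes (hypothetically) that items are <li class="item"> elements and
    that pagination uses <a rel="next" href="...">.
    """

    def __init__(self):
        super().__init__()
        self.items = []
        self.next_href = None
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "next":
            self.next_href = attrs.get("href")
        elif tag == "li" and attrs.get("class") == "item":
            self._in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_item and text:
            self.items.append(text)


def scrape_all_pages(start_url, fetch, max_pages=50):
    """Scrape start_url, then follow rel="next" links until none remain.

    fetch is any callable that takes a URL and returns the page's HTML as a
    string. max_pages and the seen set guard against pagination loops.
    """
    results, seen, url = [], set(), start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = PageParser()
        parser.feed(fetch(url))
        results.extend(parser.items)
        # Resolve a relative "next" link against the current page's URL.
        url = urljoin(url, parser.next_href) if parser.next_href else None
    return results
```

Passing fetch in as a parameter keeps the page-following loop separate from
however the pages are actually retrieved, so the same loop can be exercised
against canned HTML before pointing it at a live site.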

-- 
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
-------------------------------------------------------------------

_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
