Stefano Mazzocchi wrote:
> Kimble Young wrote:
>> Hi,
>>
>> I've been looking at Solvent and Crowbar for doing some of my own
>> mashups but using my own APIs, database etc. Crowbar is very
>> promising and I had some luck with it initially but it seems that I've
>> hit a wall with multiple page scraping.
>>
>> Multi-page scraping would be invaluable in making Crowbar and Solvent a
>> very powerful solution for people who want to make their own mashups
>> outside of the environment provided by Piggybank.
>>
>> Do we know what's involved in making multi-page scraping happen? Is it a
>> complex solution?
>
> Crowbar can do anything that Piggy Bank + Solvent can do and multi-page
> scraping has been in Piggy Bank for quite some time.

Actually, it can't run a multi-page scraper. We have actively tried to
make it work; the multi-page part should silently open up a new, hidden
frame in the Crowbar URL display and proceed with scraping there, but
for whatever reason it exits instead.

We had a contributor who was looking into improving Crowbar. His
patches made it into Crowbar, improving its reliability; last I heard,
he was going to look into the multi-page problem, but it's been some
time since I heard from him.

Before his patches, I had a vague idea of what was going on. With his
patches, which I haven't had the opportunity to review in depth, I have
even less of an idea - maybe the solution is complex, maybe there's a
typo that needs fixing. I couldn't really say at this point.

-- 
Ryan Lee                 [EMAIL PROTECTED]
MIT CSAIL Research Staff http://simile.mit.edu/
http://people.csail.mit.edu/ryanlee/
