Ryan Lee wrote:
> Stefano Mazzocchi wrote:
>> Kimble Young wrote:
>>> Hi,
>>>
>>> I've been looking at Solvent and Crowbar for doing some of my own mashups,
>>> but using my own APIs, database, etc. Crowbar is very
>>> promising and I had some luck with it initially but it seems that I've
>>> hit a wall with multiple page scraping.
>>>
>>> Multi-page scraping would be invaluable in making Crowbar and Solvent a
>>> very powerful solution for people who want to make their own mashups
>>> outside of the environment provided by Piggybank.
>>>
>>> Do we know what's involved in making multi-page scraping happen? Is it a
>>> complex solution?
>> Crowbar can do anything that Piggy Bank + Solvent can do and multi-page
>> scraping has been in Piggy Bank for quite some time.
> 
> Actually, it can't run a multi-page scraper.  We have actively tried to 
> make it work; the multi-page part should silently open up a new, hidden 
> frame in the Crowbar URL display and proceed with scraping there, but 
> for whatever reason it exits instead.
> 
> We had a contributor who was looking into improving Crowbar.  His 
> patches made it into Crowbar, improving its reliability; last I heard, 
> he was going to look into the multi-page problem, but it's been some 
> time since I heard from him.
> 
> Before his patches, I had a vague idea of what was going on.  With his 
> patches, which I haven't had the opportunity to review in depth, I have 
> even less of an idea - maybe the solution is complex, maybe there's a 
> typo that needs fixing.  I couldn't really say at this point.

D'oh, I stand corrected... maybe I should stop having so many balls to
juggle that I don't even remember the state they're in :-)

-- 
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
-------------------------------------------------------------------

_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
