Thanks,

A bit of a flub on my part: I didn't generate the metadata file. Once I had
that, it all became obvious.

But that brought up another question.

Is there any way to create a scraper that works against web forms?
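
For instance, something like this (a rough sketch only -- the URL and the
form field names are made up, just to show what I mean by having the
scraper drive the form itself and then scrape what comes back):

    // Hypothetical example: POST the form fields from the scraper, then
    // scrape the HTML response the same way as a normal page.
    var xhr = new XMLHttpRequest();
    xhr.open("POST", "http://example.org/search", false); // synchronous for simplicity
    xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    xhr.send("query=" + encodeURIComponent("bicycles") + "&region=boston");
    var html = xhr.responseText; // parse this as usual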

Ernie

On 8/3/07, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:
>
> Ernest Martinez wrote:
> > I have just installed Piggy Bank and Solvent. I went through the Solvent
> > tutorial in order to create a screen scraper for Craigslist.
> > Unfortunately, the tutorial pretty much leaves one at a dead end with
> > respect to deploying the scraper.
>
> Ehm, <blush/>, yeah, we really gotta fix that. (and then we wonder why
> nobody writes scrapers for piggy bank, maybe they don't know what to do
> with them once they wrote them!)
>
> > I ended up with a JavaScript file, but
> > I don't have a web server to deploy to. Nor do I know exactly how to
> > deploy this particular file. I understand that to install scrapers you
> > need to go to a web page, but there is no explanation of how to create
> > that page.
>
> We do it by posting them on our wiki.
>
> http://simile.mit.edu/wiki/Category:Javascript_screen_scraper
>
> Each scraper on our wiki has some RDF embedded in the wikitext. See, for
> example:
>
> http://simile.mit.edu/wiki/Orkut_Friends_Scraper
>
> in the "formal semantic markup" section.
>
> You can create a new wiki page and then copy the existing wikitext. The
> hack is to use another wiki page to contain the scraper code and then
> point the semantic markup of the description page at that other page,
> calling it with the "raw" parameter (so that MediaWiki returns the
> JavaScript and not an HTML page about it). For the example above, that is
>
> http://simile.mit.edu/wiki/Orkut_Friends_Scraper_Script?action=raw
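>
> So the "formal semantic markup" section of the description page ends up
> looking roughly like this (a sketch only -- the namespace and the
> class/property names below are placeholders; copy the exact vocabulary
> from the Orkut page rather than from here):
>
>    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>             xmlns:scraper="http://simile.mit.edu/2005/scraper-example#">
>      <!-- placeholder names, for illustration only -->
>      <scraper:JavascriptScraper
>          rdf:about="http://simile.mit.edu/wiki/Orkut_Friends_Scraper">
>        <scraper:name>Orkut Friends Scraper</scraper:name>
>        <scraper:code
>            rdf:resource="http://simile.mit.edu/wiki/Orkut_Friends_Scraper_Script?action=raw"/>
>      </scraper:JavascriptScraper>
>    </rdf:RDF>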
>
> Feel free to use our wiki for your new scraper.
>
> > Also, is there any way to open an existing scraper in case I want to
> > make corrections?
>
> No, not yet. We did it by cut/pasting from another file. But I agree it
> would be a useful thing to have.
>
> > I never checked the box to deploy to a web server.
>
> We do plan to make it *a lot* easier to load and save scrapers from
> Solvent but we have no ETA on that.
>
> --
> Stefano Mazzocchi
> Digital Libraries Research Group                 Research Scientist
> Massachusetts Institute of Technology
> E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
> Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
> -------------------------------------------------------------------
>
_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
