Ernest Martinez wrote:
> I have just installed piggybank and solvent. I went through the solvent
> tutorial in order to create a screen scraper for Craigslist.
> Unfortunately the tutorial pretty much leaves one at a dead end with
> respect to deploying the scraper.

Ehm, <blush/>, yeah, we really gotta fix that. (And then we wonder why
nobody writes scrapers for Piggy Bank; maybe they don't know what to do
with them once they've written them!)

> I ended up with a javascript file but
> I don't have a webserver to deploy to. Nor do I know exactly how to deploy
> this particular file. I understand that to install scrapers you need to
> go to a web page, but there is no explanation on how to create that page.

We do it by posting them on our wiki.

http://simile.mit.edu/wiki/Category:Javascript_screen_scraper

Each scraper on our wiki has some RDF embedded in the wikitext. See, for
example:

http://simile.mit.edu/wiki/Orkut_Friends_Scraper

in the "formal semantic markup" section.

You can create a new wiki page and copy the existing wikitext into it.
The hack is to put the scraper code on a second wiki page and then point
the semantic markup of the description page at that second page, adding
the "action=raw" parameter (so that MediaWiki returns the JavaScript
itself and not an HTML page about it). For the example above, that is

http://simile.mit.edu/wiki/Orkut_Friends_Scraper_Script?action=raw
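
As a rough sketch of how that "raw" URL is derived from a script page
URL, here is a small Python helper. It is not part of Piggy Bank or
Solvent, and the function name raw_url is made up for illustration; it
only sets the action=raw query parameter on whatever page URL you give it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def raw_url(page_url: str) -> str:
    """Return page_url with action=raw set in its query string.

    Hypothetical helper for illustration only: it replaces any existing
    "action" parameter and keeps all other query parameters as they are.
    """
    parts = urlsplit(page_url)
    # Keep every existing parameter except a previous "action" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "action"
    ]
    query.append(("action", "raw"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(raw_url("http://simile.mit.edu/wiki/Orkut_Friends_Scraper_Script"))
```

For the Orkut script page this prints the URL shown above; for your own
scraper you would pass the URL of the wiki page that holds your script.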

Feel free to use our wiki for your new scraper.

> Also is there any way to open an existing scraper in case I want to make
> corrections?

No, not yet. We did it by cutting and pasting from another file. But I
agree it would be a useful thing to have.

> I never checked the box to deploy to a webserver.

We do plan to make it *a lot* easier to load and save scrapers from
Solvent but we have no ETA on that.

-- 
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
-------------------------------------------------------------------

