Re: Solvent Documentation

Ernest Martinez Sat, 04 Aug 2007 20:10:55 -0700

I have an application which uses forms to enter data, but also uses the same
forms for search results. I was hoping that I could scrape the fielded
information.
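
To make the request concrete, here is a rough sketch of the kind of
extraction meant, assuming the search results populate ordinary HTML form
controls. The helper name collectFields is hypothetical and is not part of
Solvent or Piggy Bank; it only reads the standard name, type, value and
checked properties of each control.

```javascript
// Hypothetical helper (not a Solvent or Piggy Bank API): collect
// name/value pairs from a list of form controls. `controls` is any
// array-like of objects exposing `name`, `type`, `value` and `checked`,
// for example `form.elements` in a browser.
function collectFields(controls) {
  var result = {};
  for (var i = 0; i < controls.length; i++) {
    var c = controls[i];
    // Skip unnamed controls and buttons, which carry no record data.
    if (!c.name || c.type === "submit" || c.type === "button" || c.type === "reset") {
      continue;
    }
    // Unchecked checkboxes and radio buttons contribute nothing.
    if ((c.type === "checkbox" || c.type === "radio") && !c.checked) {
      continue;
    }
    result[c.name] = c.value;
  }
  return result;
}
```

In a page this could be called as collectFields(document.forms[0].elements),
assuming the first form on the page is the one holding the result record.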


Thanks
Ernie

P.S. I'd be happy to help if I can with respect to your Wiki.


On 8/3/07, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:
>
> Ernest Martinez wrote:
> > Thanks,
> >
> > A bit of a flub on my part: I didn't generate the metadata file. Once I
> > had that, it all became obvious.
>
> Cool... maybe you could help us out by writing your experience on the
> wiki, hint, hint ;-)
>
> > But that brought up another question.
> >
> > Is there any way to create a scraper that works against web forms?
>
> I'm not sure I understand what you mean by that. Can you elaborate more?
>
> > Ernie
> >
> > On 8/3/07, *Stefano Mazzocchi* <[EMAIL PROTECTED]
> > <mailto:[EMAIL PROTECTED]>> wrote:
> >
> >     Ernest Martinez wrote:
> >     > I have just installed Piggy Bank and Solvent. I went through the
> >     > Solvent tutorial in order to create a screen scraper for
> >     > Craigslist. Unfortunately the tutorial pretty much leaves one at
> >     > a dead end with respect to deploying the scraper.
> >
> >     Ehm, <blush/>, yeah, we really gotta fix that. (and then we wonder
> >     why nobody writes scrapers for piggy bank, maybe they don't know
> >     what to do with them once they wrote them!)
> >
> >     > I ended up with a JavaScript file, but I don't have a web server
> >     > to deploy to. Nor do I know exactly how to deploy this particular
> >     > file. I understand that to install scrapers you need to go to a
> >     > web page, but there is no explanation of how to create that page.
> >
> >     We do it by posting them on our wiki.
> >
> >     http://simile.mit.edu/wiki/Category:Javascript_screen_scraper
> >
> >     Each scraper on our wiki has some RDF embedded in the wikitext. See,
> >     for example:
> >
> >     http://simile.mit.edu/wiki/Orkut_Friends_Scraper
> >
> >     in the "formal semantic markup" section.
> >
> >     You can create a new wiki page and then copy the existing wikitext.
> >     The hack is to use another wiki page to contain the scraper code
> >     and then point the semantic markup of the description page to that
> >     other page, but with the "action=raw" parameter (so that MediaWiki
> >     returns the JavaScript and not an HTML page about it). For the
> >     example above, that is
> >
> >     http://simile.mit.edu/wiki/Orkut_Friends_Scraper_Script?action=raw
> >
> >     Feel free to use our wiki for your new scraper.
> >
> >     > Also, is there any way to open an existing scraper in case I want
> >     > to make corrections?
> >
> >     No, not yet. We did it by cut/pasting from another file. But I agree
> >     it would be a useful thing to have.
> >
> >     > I never checked the box to deploy to a webserver.
> >
> >     We do plan to make it *a lot* easier to load and save scrapers from
> >     Solvent but we have no ETA on that.
> >
> >     --
> >     Stefano Mazzocchi
> >     Digital Libraries Research Group                 Research Scientist
> >     Massachusetts Institute of Technology
> >     E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
> >     Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
> >     -------------------------------------------------------------------
> >
> >     _______________________________________________
> >     General mailing list
> >     [email protected] <mailto:[email protected]>
> >     http://simile.mit.edu/mailman/listinfo/general
> >
>
>
> --
> Stefano.
>
>
