Re: Piggy Bank automation

Ryan Lee Fri, 30 Nov 2007 09:13:47 -0800

On its own, no; Piggy Bank scraper scripts depend on browser objects and 
the DOM API, and Rhino doesn't provide either.
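
To illustrate the point, here is a minimal sketch, not Rhino itself: it 
assumes Node.js and uses its built-in vm module as a stand-in for any 
standalone interpreter that offers only the core language objects.  The 
function name probeGlobals is made up for this example.

```javascript
const vm = require("vm");

// Evaluate "typeof <name>" for each name inside a fresh context that
// contains only the ECMAScript built-ins (no browser host objects).
// Node's vm module is an assumption here, standing in for a bare engine.
function probeGlobals(names) {
  const context = vm.createContext({});
  const result = {};
  for (const name of names) {
    result[name] = vm.runInContext("typeof " + name, context);
  }
  return result;
}

// Core language objects exist; the browser objects a scraper relies on
// (document, window) do not.
const report = probeGlobals(["Array", "document", "window"]);
console.log(report);
```

A scraper that touches document or window would fail at that point in 
such an environment unless something else supplied those objects.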


Kent Gibson wrote:
> Is there a way to hook up the Java JavaScript interpreter (Rhino) to do this?
> 
> ----- Original Message ----
> From: Ryan Lee <[EMAIL PROTECTED]>
> To: General List <[email protected]>
> Sent: Tuesday, November 27, 2007 6:09:39 PM
> Subject: Re: Piggy Bank automation
> 
> 
> Yuriy Zubarev wrote:
>> Greetings,
>>
>> I was wondering if there is a way to utilize Piggy Bank's screen
>> scraping capabilities in an automated fashion. For example, I would like
>> to have a process that scans changes and collects information from
>> different web sites every 6 hours or so and then saves the normalized
>> information into a persistent storage (database, file, etc). This
>> process, I would imagine, would control an instance(s) of Firefox
>> browser and send it information on what site to visit.
>>
>> Thank you,
>> Yuriy
> 
> Hi Yuriy,
> 
> There isn't an exact solution, but you may want to look at
> 
>    http://simile.mit.edu/crowbar/
> 
> or
> 
>    http://simile.mit.edu/wiki/Fresno
> 
> Crowbar is its own XUL application and can presently do single page 
> scraping using Piggy Bank scrapers, writing results to stdout.  Multiple 
> pages are not operational; we haven't tracked down why yet.  Fresno 
> allows you to interact with the Javascript interpreter in a running 
> Firefox via the MozRepl add-on; we haven't adapted it specifically for 
> running PB scrapers, however.
> 


-- 
Ryan Lee                  [EMAIL PROTECTED]
MIT CSAIL Research Staff  http://simile.mit.edu/
http://people.csail.mit.edu/ryanlee/
_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
