I use PhantomJS to log in to a site and scrape data from it, then I put that
data in a db.

PhantomJS runs a full headless browser, with JavaScript events, Ajax, etc.

1. Log in to a secure site and navigate to a list of orders.
2. Pull down the data for each order and write it to a file.
3. Process each file and write the results to the database.

I use PhantomJS for steps 1 and 2, and have multiple workers executing at
the same time (because of I/O and network delays).
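
A rough sketch of running several workers at once, assuming they are
launched from a Ruby script (how the workers are actually started is not
specified here). The `run_workers` helper and the `scrape_orders.js` script
name are hypothetical, used only for illustration.

```ruby
require "open3"

# Run each command in its own thread so that slow network I/O in one worker
# does not hold up the others.
# Returns one [output, success?] pair per command, in input order.
def run_workers(commands)
  threads = commands.map do |cmd|
    Thread.new do
      # capture2e merges stdout and stderr into a single string.
      output, status = Open3.capture2e(*cmd)
      [output, status.success?]
    end
  end
  threads.map(&:value)
end

# Hypothetical usage: one PhantomJS process per batch of orders.
# commands = (1..4).map { |n| ["phantomjs", "scrape_orders.js", n.to_s] }
# run_workers(commands).each { |out, ok| warn out unless ok }
```

Threads are enough for this because each one spends its time waiting on a
child process, not running Ruby code.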

I have a separate Ruby process that picks up the new files and does the
processing (and I can fire up multiple workers of this type if I need to).
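
A minimal sketch of that pickup step, under some assumptions that are not
part of the setup described above: the scraper writes one JSON file per
order into an `inbox` directory, and the database write is supplied by the
caller as a block. The `process_new_files` name and the directory layout are
hypothetical.

```ruby
require "json"
require "fileutils"

# Pick up every *.json file in `inbox`, hand its parsed contents to the
# given block, and return the names of the files that were processed.
# Files are moved into `claimed` first so two workers never take the same one.
def process_new_files(inbox, claimed)
  FileUtils.mkdir_p(claimed)
  processed = []
  Dir.glob(File.join(inbox, "*.json")).sort.each do |path|
    target = File.join(claimed, File.basename(path))
    begin
      # Renaming "claims" the file; if another worker got there first,
      # the source no longer exists and we skip it.
      File.rename(path, target)
    rescue Errno::ENOENT
      next
    end
    order = JSON.parse(File.read(target))
    yield order # the caller writes the record to the database here
    processed << File.basename(path)
  end
  processed
end

# Hypothetical usage:
# process_new_files("inbox", "claimed") { |order| save_order(order) }
```

The rename relies on `inbox` and `claimed` being on the same filesystem. A
file whose processing raises an error stays in `claimed`, where it can be
inspected and retried.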

CasperJS <http://casperjs.org/> is a nice addition that can make writing
your scripts even easier.



On Tue, Oct 15, 2013 at 10:52 AM, Jack R-G <[email protected]> wrote:

> PhantomJS looks interesting.  Can it 1) read/write a Postgres database,
> and 2) fill in form fields and submit forms, with the JavaScript associated
> with the various events firing automatically?  I looked at the PhantomJS
> website but didn't find any examples of these uses; it would be helpful if
> you could point me at examples.
>
> If PhantomJS cannot access Postgres, is there a typical usage pattern for
> PhantomJS code that uses it as a helper to retrieve pages and return them
> to the caller for further processing?
>
>
> On Monday, October 14, 2013 11:38:07 AM UTC-7, j_McCaffrey wrote:
>
>> I can't address your specific situation, but can recommend PhantomJS
>> instead:
>>
>> https://github.com/stomita/heroku-buildpack-phantomjs
>>
>>
>> On Mon, Oct 14, 2013 at 1:12 PM, Jack Royal-Gordon <[email protected]> wrote:
>>
>>> I'm trying to scrape some websites that rely on JavaScript, so I found this
>>> article <http://stackoverflow.com/questions/11494994/is-it-possible-to-plug-a-javascript-engine-with-ruby-and-nokogiri>
>>> discussing Watir and headless processing.  I ran into the following
>>> exception: Headless::Exception: Xvfb not found on your system.  So I
>>> started researching Xvfb and discovered that it is a stand-alone display
>>> server.  I then started looking into how to get it installed on Heroku, and
>>> I found a gist <https://gist.github.com/atduskgreg/5100799> where the
>>> author discusses building a statically linked binary, but he doesn't really
>>> come to a successful conclusion.  I've also seen mention of using a custom
>>> buildpack, but nothing definite there either.
>>>
>>> Does anyone have experience with this and can offer some advice on how
>>> to proceed?
>>>
>>> --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Heroku" group.
>>>
>>> To unsubscribe from this group, send email to
>>> heroku+un...@googlegroups.com
>>>
>>> For more options, visit this group at
>>> http://groups.google.com/group/heroku?hl=en_US?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Heroku Community" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to heroku+un...@googlegroups.com.
>>>
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>
>>
>> --
>> Thanks,
>> -John
>>



-- 
Thanks,
-John
