Hi, guys.

I want to scrape an HTML site that uses JavaScript to generate its
content, so I can't use the mechanize gem or similar ones. I've tried
rdom and taka with johnson, but I still ran into problems (I can give
more details if needed). The best and easiest option I have at the
moment is to use watir (or selenium, or celerity for JRuby). I've
chosen watir because it's simple, using either the watir gem or the
watir-webdriver gem. I like them both, but I have two problems:

- I want to deploy the app on Heroku, but I get the error: "Could not
find Firefox binary (os=linux)".
- I don't know whether it's possible to use the watir logic without
needing the browser binary (and without opening a browser in the
background).
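
For reference, the error in the first point suggests that no Firefox
executable is visible to the process. A minimal sketch of how that can
be checked with plain Ruby, assuming the binary is simply looked up on
the PATH (the helper name find_binary is hypothetical, not part of
watir or selenium-webdriver):

```ruby
# Hypothetical helper (not part of watir or selenium-webdriver):
# returns the full path of the first executable called `name` found
# in the given PATH-style string, or nil if there is none.
def find_binary(name, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Report whether a "firefox" executable is reachable from this process.
firefox = find_binary("firefox")
puts firefox ? "Firefox found at #{firefox}" : "No firefox binary on the PATH"
```

Locally this should print the path to Firefox; I would expect it to
report that nothing is found on a machine without a browser installed.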

I currently have an answer here:
http://stackoverflow.com/questions/3597118/can-you-deploy-watir-on-heroku-to-generate-html-snapshots-if-so-how,
but I just wanted to confirm the options I have.

I wrote a watir-webdriver example, which works well locally, to
illustrate the simple process (in this case the HTML is not dynamically
generated, of course; it's only an example):

  require "rubygems"
  require "watir-webdriver"
  require "watir-webdriver/extensions/wait"

  # Launch a local Firefox instance (this needs the Firefox binary)
  browser = Watir::Browser.new :firefox
  browser.goto "http://google.com"
  # Type the query into the search box and submit it
  browser.text_field(:name, 'q').set "watir-webdriver"
  browser.button(:name, 'btnG').click

Maybe the only option I have is to use EC2, but that would be a pity
because I only need to scrape JavaScript-generated HTML and I want to
keep using Heroku. I love it!!!

What do you think is the best gem for doing this on Heroku? Or is
there no option, so that I have to use EC2 just to open a browser and
lose the Heroku goodness?

Thanks in advance
