I have a screen scraping library called Scraper that just needs a little cleanup before release; I just haven't found a chance to do it yet. I'm not too familiar with WWW::Mechanize, but from what I remember the usage wasn't very clean.

Here's what Scraper looks like:

  # a session tracks cookies, so you can log in to web-based applications
  # that don't use Basic auth
  s = Scraper::Session.new

  # use the session to grab a URL
  r = s.get "http://www.google.com/"

  # fill out a form
  form = r.form
  form[:q] = "ruby scraper"

  # submit the form and get the resulting page
  results = form.submit

  # browse the web by following a couple of links, and show the resulting page
  puts results.links.last.follow.links.last.follow.html
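
Logging in to a cookie-backed app is just another form fill, and the session carries the cookie forward. A quick sketch (the URL and field names here are made up -- use whatever the real login form calls them):

  # log in once; the session keeps the cookie for later requests
  s = Scraper::Session.new
  login = s.get "http://example.com/login"

  form = login.form
  form[:username] = "me"          # hypothetical field names
  form[:password] = "secret"
  form.submit

  # later gets on the same session are authenticated
  puts s.get("http://example.com/account").html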


Anyone interested?




On Oct 6, 2005, at 9:36 AM, John Labovitz wrote:

Also, if anyone knows of alternatives to Watir for cross-platform browsers, let me know.


It's not exactly the same thing, but WWW::Mechanize is pretty good for scraping and controlling websites. I've had a bunch of experience using it to implement front-ends and data-suckers for a client.

Unfortunately, WWW::Mechanize doesn't have very good docs or support. It's one of those back-burner ideas of mine to write up a how-to article.
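
In the meantime, the basics look roughly like this -- this is from memory, so treat it as a sketch and double-check the method names against the RDoc:

  require 'rubygems'
  require 'mechanize'

  # an agent tracks cookies across requests, like Scraper's session
  agent = WWW::Mechanize.new
  page = agent.get "http://www.google.com/"

  # fill in the first form on the page and submit it
  form = page.forms.first
  form['q'] = "ruby scraper"
  results = agent.submit form

  # follow a link and print the raw HTML
  page = agent.click results.links.last
  puts page.body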

You can get it as a gem -- "mechanize".
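
That is:

  gem install mechanize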

The homepage is currently busted, but is usually at http://www.ntecs.de/blog/Blog/WWW-Mechanize.rdoc

--John


_______________________________________________
PDXRuby mailing list
[email protected]
IRC: #pdx.rb on irc.freenode.net
http://lists.pdxruby.org/mailman/listinfo/pdxruby
