On 12 July 2011 10:02, aupayo <[email protected]> wrote:
> Hi,
>
> I want to screen scrape information from some websites (I have
> permission to do it).
>
> I am using the Mechanize plugin. The websites are different from each
> other, so I need to write a different RoR code to screen scrape each
> website. There would be hundreds of different websites.
>
> Ok, the problem is that I don't know how to implement this in an
> elegant and efficient way. My current quick and dirty solution is a
> model that I call when I want to screen scrape a website:
>
> I call it like: Spider.crawl(website_id)
>
> It looks like:
>
> class Spider < ActiveRecord::Base
>
>   require 'mechanize'
>
>   def crawl(website_id)
>
>     if(website_id == 1)
>       //Mechanize code for screen scraping website 1
>     end
>
>     if(website_id == 2)
>       //Mechanize code for screen scraping website 2
>     end
>
>     .....
>
>   end
>
> end
>
>
> How can I improve that?
> Is there at least a way to put the code for each website in an
> external file, so then I can call just the code I need? That way I
> would avoid working with a model that has thousands of lines...

If you just want to split it up, then provide a set of models (not based on ActiveRecord), one for each site, and call the appropriate model's scrape method from your list of if tests (which would be better as a case statement). If you derive them all from a common base class, you can put any common code in the base.

Colin
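
A minimal sketch of that layout might look like the following. The names BaseScraper, SiteOneScraper, SiteTwoScraper and the extract/normalize helpers are hypothetical and not from the original post, and the Mechanize calls are replaced by placeholder data so the example runs on its own. crawl is written as a module-level method so that the Spider.crawl(website_id) call from the question works as written.

```ruby
# Hypothetical class and method names, for illustration only.

# Common base: a plain Ruby class, not derived from ActiveRecord::Base.
class BaseScraper
  # Shared entry point; subclasses only supply the site-specific part.
  def scrape
    normalize(extract)
  end

  private

  # Each site overrides this with its own Mechanize code.
  def extract
    raise NotImplementedError, "#{self.class} must implement extract"
  end

  # Example of common code that lives once, in the base class.
  def normalize(rows)
    rows.map { |row| row.strip }
  end
end

# One class per site; in a Rails app each could sit in its own file.
class SiteOneScraper < BaseScraper
  private

  def extract
    # Mechanize code for website 1 would go here (placeholder data).
    [" item from site one "]
  end
end

class SiteTwoScraper < BaseScraper
  private

  def extract
    # Mechanize code for website 2 would go here (placeholder data).
    [" item from site two "]
  end
end

module Spider
  # A case statement replaces the chain of separate if tests.
  def self.crawl(website_id)
    scraper =
      case website_id
      when 1 then SiteOneScraper.new
      when 2 then SiteTwoScraper.new
      else raise ArgumentError, "no scraper for website #{website_id}"
      end
    scraper.scrape
  end
end
```

With this shape, adding a site means adding one small class and one `when` line, and Spider itself stays short.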

