Re: [CODE4LIB] screen scraping

2011-10-02 Thread Ken Irwin
I don't know that there are too many rules about this, but here's what comes to mind for me: 1. respect robots.txt 2. cache content so you don't hit their site more often than is reasonable. (I'd say that once a day is pretty reasonable) 3. also cache or mockup or something when you're writing
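
For anyone who wants to see the first two points in code, here is a minimal Python sketch, not a definitive implementation. It assumes the robots.txt text has already been downloaded, and the names `allowed_by_robots` and `DailyCache`, along with the `fetch` and `clock` parameters, are hypothetical and made up for illustration. The one-day default comes from the "once a day" suggestion above.

```python
import time
import urllib.robotparser


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class DailyCache:
    """Hypothetical in-memory cache: re-fetch a URL at most once per max_age."""

    def __init__(self, fetch, max_age_seconds=24 * 60 * 60, clock=time.time):
        self._fetch = fetch            # callable(url) -> body; assumed supplied by the caller
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the expiry logic can be tested
        self._store = {}               # url -> (time fetched, body)

    def get(self, url):
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]           # still fresh: do not touch the remote site
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

Because `fetch` is passed in, a stub that returns a saved copy of the page can stand in for the real download while the script is being developed, so the remote site is not hit on every test run. A real script would probably want to persist the cache to disk rather than keep it in memory.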

Re: [CODE4LIB] screen scraping

2011-10-02 Thread Roberto Hoyle
On 10/2/2011 10:23 PM, Nate Hill wrote: A question: what are the 'rules' around screen scraping? If one site doesn't offer an RSS feed and you want to grab (for example) their weekly top ten list with a script and then redisplay it on another site, is that bad form? Or even illegal? If the

Re: [CODE4LIB] screen scraping

2011-10-02 Thread Nate Hill
I think what I'm hearing here is that it would be a good idea to ask a webmaster on the other end if it's OK. Advertising... Roberto, good point; I hadn't thought of that. Thanks. On Sun, Oct 2, 2011 at 7:46 PM, Roberto Hoyle rjho...@gmail.com wrote: On 10/2/2011 10:23 PM, Nate Hill wrote: A

Re: [CODE4LIB] screen scraping

2011-10-02 Thread Tracy Seneca
I don’t know how well this applies to your specific use of screen-scraping, but for libraries’ broader use of crawlers to build archives, the Section 108 Study Group Recommendations are a good source of guidance (though not law). They propose specific copyright exceptions for libraries in