I don't know that there are too many rules about this, but here's what comes to
mind for me:
1. respect robots.txt
2. cache content so you don't hit their site more often than is reasonable
(I'd say that once a day is pretty reasonable)
3. also cache or mockup or something when you're writing
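
Here's a rough sketch of points 1 and 2 in Python, in case it helps. The
function names, the cache file, and the `fetch` callable are all made up for
illustration, and the one-day limit is just my guess from above, not a rule.

```python
import time
import urllib.robotparser
from pathlib import Path

ONE_DAY = 24 * 60 * 60  # seconds; the "once a day" guess from point 2


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def cached_fetch(url, cache_file, fetch, max_age=ONE_DAY):
    """Return the cached copy if it is younger than max_age seconds.

    Otherwise call fetch(url), a hypothetical callable that returns the
    page as text, save the result to cache_file, and return it.
    """
    path = Path(cache_file)
    if path.exists() and time.time() - path.stat().st_mtime < max_age:
        return path.read_text(encoding="utf-8")
    content = fetch(url)
    path.write_text(content, encoding="utf-8")
    return content
```

The idea is to check `allowed_by_robots` before calling `cached_fetch`, and to
pass in whatever you already use to download pages as `fetch`. Passing a fake
`fetch` that returns a saved page also lets you develop without touching the
real site at all.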
On 10/2/2011 10:23 PM, Nate Hill wrote:
A question: what are the 'rules' around screen scraping?
If one site doesn't offer an RSS feed and you want to grab (for example)
their weekly top ten list with a script and then redisplay it on another
site, is that bad form? Or even illegal?
If the
I think what I'm hearing here is that it would be a good idea to ask a
webmaster on the other end if it's OK.
Advertising... Roberto, good point, I hadn't thought of that. Thanks.
On Sun, Oct 2, 2011 at 7:46 PM, Roberto Hoyle rjho...@gmail.com wrote:
On 10/2/2011 10:23 PM, Nate Hill wrote:
A
I don’t know how well this applies to your specific use of screen-scraping, but
for libraries’ broader use of crawlers to build archives, the Section 108 Study
Group Recommendations are a good source of guidance (though not law). They
propose specific copyright exceptions for libraries in