On 4 January 2011 09:11, Mark Goodge <[email protected]> wrote:
>
> There's no such thing as a website which can't be scraped, given sufficient
> time and resources. If Bailii is rate-limiting by IP address (which is the
> most common form of anti-scraping protection) then that can easily be
> circumvented by doing it slowly enough from enough different locations. But,

Bailii limits scraping via robots.txt and an express denial of any licence
to do so in its website's terms and conditions:

http://www.bailii.org/bailii/copyright.html
http://www.bailii.org/robots.txt

So, scraping the site via a robot is likely to be a section 1 Computer
Misuse Act offence.

> from a practical perspective, I would be wary of the dangers of infringing
> the intellectual property of an organisation which counts a number of
> law firms among its sponsors :-)

Yes. Much of the legal professions (misguidedly, but it's the best there
is...).

I doubt that anything Nick does could possibly be an infringement of any
copyright Bailii might have (which is going to be minimal at best - in the
HTML markup) or, probably, of any database right (on the basis of the way in
which Nick uses the service), though I haven't given it deep thought.

-- 
Francis Davey

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
