Andy Robinson wrote:
> David Earl wrote:
>>On 01/12/2008 14:11, Brian Quinion wrote:
>>> Has anyone got any suggestions, or is willing to offer any data? Even
>>> personal address books would be useful for testing...
>>
>>You know all the 2,500 or so prefixes, and there are only 26 x 26 * 100
>>combinations for the second part for each - about 200 million in all. If
>>you feed these potential postcodes in quotes into Google UK over a long
>>period with appropriate pauses so as not to get locked out, and look at
>>the result for recognizable addresses (that's the tricky bit) as I'm
>>doing in the Namefinder, you'd probably cover 75% of UK postcodes.
>>
> I'm curious about this. Data scraped via Google is still subject to the
> terms of the original page it references?

I looked into this and came to the conclusion that you could probably
claim 'fair use' as long as you pulled each address from a different
website. The trouble is that most searches land on one of a small
number of directory sites, so doing any significant number of them is
likely to amount to a database extraction. The results are also mostly
limited to business addresses.

It would probably be possible to filter the requests so that not too
many went to any one site, but that still leaves the possibility that
those sites used Royal Mail's postcode finder (or similar) to get their
original data. Across a large number of sites you could end up doing a
database extraction from Royal Mail regardless.

Address books and company mailing lists seemed like a preferable source,
and as long as individuals' names are not included, privacy shouldn't be
an issue.

--
Brian

_______________________________________________
Talk-GB mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk-gb

