Brian Quinion wrote:
>Sent: 01 December 2008 4:01 PM
>To: Andy Robinson (blackadder-lists)
>Cc: David Earl; [email protected]
>Subject: Re: [Talk-GB] Request for UK address lists for postcode extraction
>
>Andy Robinson wrote:
>> David Earl wrote:
>>>On 01/12/2008 14:11, Brian Quinion wrote:
>>>> Has anyone got any suggestions, or is willing to offer any data? Even
>>>> personal address books would be useful for testing...
>>>
>>>You know all the 2,500 or so prefixes, and there are only 26 x 26 * 100
>>>combinations for the second part for each - about 200 million in all. If
>>>you feed these potential postcodes in quotes into Google UK over a long
>>>period with appropriate pauses so as not to get locked out, and look at
>>>the result for recognizable addresses (that's the tricky bit) as I'm
>>>doing in the Namefinder, you'd probably cover 75% of UK postcodes.
>>>
>> I'm curious about this. Data scraped via Google is still subject to the
>> terms of the original page it references?
>
>I looked into this and came to the conclusion that you could probably
>claim 'fair use' as long as you pulled each address from a different
>website. The trouble is that for most searches you end up on one of a
>small number of directory sites so doing any significant number is
>likely to end up as a database extraction. The results are also
>mostly limited to business addresses.
>
>Probably it would be possible to filter it so not too many requests
>went to any one site, but that still leaves the possibility that they
>used royal mails postcode finder (or similar) to find their original
>data. Across a large number of sites you could end up doing a
>database extraction from royal mail regardless.
>
>Address books and company mailing lists seemed like a preferable
>source and as long as individuals names are not included privacy
>shouldn't be an issue.
>

I'd noted that too: most results come from business directory listings
(Yell, Thomson, etc.) or from house price finders that use copyrighted
Land Registry data in the background.

One source I am exploring is the planning application listings published
by local authorities, which I think is where you were headed?

Cheers

Andy
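
As a rough check on the estimate quoted above, here is a minimal Python sketch that multiplies out the figures exactly as David gave them (2,500 or so prefixes, 26 x 26 * 100 second parts for each) and estimates how long querying them all would take. The function names and the one-second pause are hypothetical, chosen only for illustration; nobody in the thread specified a pause length.

```python
# Figures as quoted in the thread; both are rough estimates, not verified counts.
PREFIXES = 2_500                      # "2,500 or so prefixes"
SUFFIXES_PER_PREFIX = 26 * 26 * 100   # "26 x 26 * 100" second-part combinations

SECONDS_PER_DAY = 86_400


def candidate_count(prefixes: int = PREFIXES,
                    suffixes: int = SUFFIXES_PER_PREFIX) -> int:
    """Total number of candidate postcodes to try."""
    return prefixes * suffixes


def days_to_query(total: int, pause_seconds: float) -> float:
    """Days needed to issue `total` requests with a fixed pause between them.

    `pause_seconds` is an assumed value, not one taken from the thread.
    """
    return total * pause_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    total = candidate_count()
    print(f"candidate postcodes: {total:,}")
    # Hypothetical pause of one second between requests.
    print(f"days at 1 request/second: {days_to_query(total, 1.0):,.0f}")
```

With the quoted figures the product comes to 169 million, which is consistent with "about 200 million" as a round number.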

