-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Earl wrote:
> I have been wondering about postcodes. We have a postal_code tag which can
> be applied to streets and it would be nice to collect these. However it is
> not something like name plates that you find in the street by observation.
> There are about 2 million postcodes in the UK, so gathering them manually
> via the freethepostcode project is hard. But using data from most other
> places is subject to copyright.
> 
> But what about this as a 90% solution:
> 
> (a) generate a list of potential postcodes (there are just over 520 million
> patterns of the form "[A-Z]{1,2}[0-9]{1,2} [0-9][A-Z]{2}"). Unlike the codes
> in countries that use numerical ones, these are quite distinctive and
> amenable to pattern matching.

There are a few that fall outside that pattern because they have an extra
letter on the end of the first half. SW1A 1AA is the postcode of Buckingham
Palace, and W1A 1AA is BBC Broadcasting House. SW1W also exists as a prefix.
I expect there are more. A corrected pattern is:
"[A-Z]{1,2}[0-9][0-9A-Z]{0,1} [0-9][A-Z]{2}"
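
As a quick check, here is a minimal Python sketch of that corrected pattern.
The function name is_postcode is just something I made up for illustration,
and the pattern only tests the shape of a string, not whether the postcode
has actually been allocated.

```python
import re

# Outward code: 1-2 letters, a digit, then an optional digit or letter.
# Inward code: a digit followed by two letters.
POSTCODE_RE = re.compile(r"[A-Z]{1,2}[0-9][0-9A-Z]? [0-9][A-Z]{2}")


def is_postcode(text):
    """Return True if text is exactly one postcode-shaped string."""
    return POSTCODE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["SW1A 1AA", "W1A 1AA", "GIR 0AA"]:
        print(candidate, is_postcode(candidate))
```

The two examples with the extra letter match. GIR 0AA does not, so special
cases like that would need handling separately if they ever mattered.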

There are also a few odd cases such as GIR 0AA for Girobank, one for
Gibraltar, and some for other UK dependent islands, but they probably
don't matter because they aren't connected directly to streets.

I'm not sure why we have to generate and store the whole list. That just
seems like a huge waste of time and disc space. Surely we only need to
store postcodes we know something about. Allocate searches by whole
prefix, not by

> (b) enumerate streets in the UK from OSM(1) and determine what place they
> are "near" in the UK(2) (e.g. gives us "Hinton Road, Fulbourn" among many
> others)
> 
> (c) do an automated web search on the street. The hits will nearly always(3)
> contain a result which includes one or more addresses in the summary (no
> need to go further). Do a pattern match which is restrictive enough to
> determine the postcode(s) for the address in the sought street, but general
> enough to cope with some variability (punctuation, skipping suburbs and
> counties in the address and so on)
> 
> (d) look up the pattern-matched code from (c) in our table from (a) and fill
> in against it the lat/lon derived from (b). Take the postcode from our table
> and apply it to the postal_code tag in OSM for the street.
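
Steps (c) and (d) could look roughly like the Python sketch below. This is
only my guess at one way to do the matching: the names normalise and
postcodes_for_street, the 80-character window and the rule that the place
name must sit between the street and the postcode are all assumptions, and
the snippet in the usage note is invented, not a real search result.

```python
import re

# Postcode shape, allowing the space to be missing in the source text.
POSTCODE_RE = re.compile(r"\b[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}\b")


def normalise(text):
    """Upper-case and replace runs of punctuation with single spaces."""
    return re.sub(r"[^A-Z0-9]+", " ", text.upper()).strip()


def postcodes_for_street(snippet, street, place, window=80):
    """Return postcodes that follow `street` closely, with `place` between."""
    text = normalise(snippet)
    street = normalise(street)
    place = normalise(place)
    found = []
    start = text.find(street)
    while start != -1:
        end = start + len(street)
        tail = text[end:end + window]
        match = POSTCODE_RE.search(tail)
        # Accept the code only if the place name appears before it, so an
        # address in a same-named street elsewhere is skipped.
        if match and place in tail[:match.start()]:
            code = match.group().replace(" ", "")
            code = code[:-3] + " " + code[-3:]  # canonical single space
            if code not in found:
                found.append(code)
        start = text.find(street, start + 1)
    return found
```

For example, postcodes_for_street("12 Hinton Road, Fulbourn, Cambridge,
CB0 0ZZ", "Hinton Road", "Fulbourn") returns ["CB0 0ZZ"], where CB0 0ZZ is a
made-up code. The returned codes are then the keys to store the lat/lon
from (b) against.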

> There is a danger that repeated automatic web searches might be autodetected
> by the G company or whoever and one's IP blocked as a result. However it
> could be done over a longish period of time (a month, say) so the search
> rate is low and spread across many volunteers. And as new streets appear
> they can be added incrementally - a much lower search volume.

Use Google's official API. This has a limit on search volume. Write the
scripts so that they stay within that limit.
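
A minimal sketch of the throttling in Python, assuming only that the limit
can be expressed as a number of queries per day. The name throttled and the
max_per_day parameter are mine, not part of any API, and the real limit
would have to be looked up.

```python
import time


def throttled(queries, max_per_day, sleep=time.sleep, clock=time.monotonic):
    """Yield each query in turn, spaced evenly to stay under max_per_day."""
    interval = 86400.0 / max_per_day  # seconds between consecutive queries
    next_allowed = clock()
    for query in queries:
        delay = next_allowed - clock()
        if delay > 0:
            sleep(delay)
        # Schedule the next slot from whichever is later: the planned slot
        # or the present moment, so a slow search never causes a burst.
        next_allowed = max(next_allowed, clock()) + interval
        yield query


# Usage, where run_search is a hypothetical function that calls the API:
#
#     for street in throttled(streets, max_per_day=1000):
#         run_search(street)
```

The sleep and clock arguments are only there so the spacing can be tested
without actually waiting.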

Something I just tried was searching for a postcode prefix, e.g. OX1.
This returned several results that included full postcodes with that
prefix, some with addresses and some without. This may be a way to find
lots of postcodes.
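
Pulling the full postcodes for one prefix out of the result text could be
sketched in Python like this. The function name postcodes_with_prefix is
made up, and the postcodes in the usage note are invented for the example.

```python
import re


def postcodes_with_prefix(text, prefix):
    """Return the sorted full postcodes in text that start with prefix."""
    prefix = prefix.upper()
    # Require a space straight after the prefix, so "OX1" does not also
    # pick up "OX10" or "OX11".
    pattern = re.compile(r"\b" + re.escape(prefix) + r" ([0-9][A-Z]{2})\b")
    inward_codes = set(pattern.findall(text.upper()))
    return sorted(prefix + " " + inward for inward in inward_codes)
```

For example, postcodes_with_prefix("Shop, OX1 1ZZ. Office: ox1 2zz; also
OX10 0ZZ", "OX1") returns ["OX1 1ZZ", "OX1 2ZZ"].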

Legally, IANAL. I know that the post office won't like it, and will
probably tell you that it is illegal. That doesn't mean they are right.

Robert (Jamie) Munro

P.S. I have used roughly the above method manually to search for some
places near where I live, and added them to NPEmaps.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGWE3Kz+aYVHdncI0RAjIYAJ4mV50s6v9UBFcbEotmpnClOvT9aQCfROtQ
tj3JvFLlruUvUm1pfpwd6E8=
=I1MA
-----END PGP SIGNATURE-----

_______________________________________________
Talk-GB mailing list
http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/talk-gb
