Time for an update on this.

I did my first proper test run of this today using the latest ONS file.
The test covered an area of about 60 square miles in southwest London.
There are about 3,000 postcodes in the result set, which I'm pretty
impressed with, assuming they are all valid, so we should see a significant
uplift in the number of mapped postcodes once I get the issues ironed out.
I'm posting it here to see if anyone else can spot things that need to be
addressed.

The first file [1] contains all buildings in OSM that already have a
postcode and that the script also picked up.  This is good for QA, and it
looks fairly good from my first look over it, but I need to do some proper
analysis.  This file isn't valid OSM XML, as each way has two tags for the
postcode, but it's useful for analysis anyway.  I'll make it a bit better
in the next iteration.
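
One way the comparison in the first file could be analysed is sketched
below.  It is only an illustration: it assumes the existing and
script-derived postcodes have already been extracted into
(way id, existing, derived) tuples, and the function names and sample rows
are hypothetical, not taken from the script.

```python
def normalise(postcode):
    """Strip all whitespace and upper-case, so 'sw14 7nx' equals 'SW147NX'."""
    return "".join(postcode.split()).upper()

def agreement(rows):
    """Return (share of rows that agree, list of rows that disagree).

    rows: iterable of (way_id, existing_postcode, derived_postcode).
    """
    rows = list(rows)
    mismatches = [
        (way_id, existing, derived)
        for way_id, existing, derived in rows
        if normalise(existing) != normalise(derived)
    ]
    rate = 1 - len(mismatches) / len(rows) if rows else 0.0
    return rate, mismatches

# Hypothetical sample rows; the way ids here are made up for illustration.
sample_rows = [
    (1, "SW14 7NX", "SW147NX"),
    (2, "sw14 7pq", "SW147PQ"),
    (3, "SW14 7NX", "SW147PQ"),
    (4, "SW147PQ", "SW147PQ"),
]
rate, mismatches = agreement(sample_rows)
```

The mismatch list is probably the more useful output, since each entry is a
way that someone would want to look at by hand.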

The second [2] is the actual output that would get loaded onto OSM.  I've
noticed two issues myself (see below), and would appreciate any input from
others as well.

Issue 1 - Some of the buildings are coming out with multiple postcodes,
e.g. way #117697674 maps to SW147NX and SW147PQ.  This appears to happen
for large ways, and I don't think there's anything that can be done other
than to filter these out.  That's very easy to do, and it looks like it'll
remove about 10% of the result set.
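
A minimal sketch of that filter, assuming the script's matches are
available as (way id, postcode) pairs.  The function name and the second
way id in the sample are hypothetical; only way 117697674 and its two
postcodes come from the output above.

```python
from collections import defaultdict

def drop_ambiguous(matches):
    """Keep only ways that map to exactly one distinct postcode.

    matches: iterable of (way_id, postcode) pairs.
    Returns a dict of way_id -> postcode.
    """
    postcodes_by_way = defaultdict(set)
    for way_id, postcode in matches:
        postcodes_by_way[way_id].add(postcode)
    return {
        way_id: next(iter(postcodes))
        for way_id, postcodes in postcodes_by_way.items()
        if len(postcodes) == 1
    }

matches = [
    (117697674, "SW147NX"),  # from the test run: one way, two postcodes
    (117697674, "SW147PQ"),
    (1, "SW147NX"),          # hypothetical way with a single postcode
]
unambiguous = drop_ambiguous(matches)
```

Using a set per way means a way that matches the same postcode twice is
still kept; only ways with two or more different postcodes are dropped.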

Issue 2 - This one will require a bit more work.  Some ways have
international characters that are getting garbled at some point during the
transformation because the script isn't handling the encoding correctly.
I'm currently looking into it; worst case, I'll have to filter these out
somehow.  An example is way 23093437 (Westminster Abbey).
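
The actual cause isn't known yet, so the following is only a sketch of one
common culprit: UTF-8 bytes being decoded as Latin-1 somewhere along the
way.  It assumes the data passes through Python's standard XML parser; the
sample XML and the helper names are hypothetical, not part of the script.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, not real OSM data: a name tag with non-ASCII characters.
SAMPLE = '<osm><way id="1"><tag k="name" v="Café Zürich"/></way></osm>'.encode("utf-8")

def garble(text):
    """Reproduce one common failure: UTF-8 bytes decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

def round_trip(xml_bytes):
    """Parse bytes and serialise back to UTF-8 bytes without decoding by hand."""
    root = ET.fromstring(xml_bytes)
    return ET.tostring(root, encoding="utf-8")

def name_of(xml_bytes):
    """Return the value of the first name tag on a way."""
    tag = ET.fromstring(xml_bytes).find("./way/tag[@k='name']")
    return tag.get("v")
```

If the garbled names in the output look like the result of garble(), the
fix is likely to be passing bytes straight to the parser and stating the
encoding explicitly when writing, rather than filtering the ways out.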

[1] http://paste.ubuntu.com/5568746/ -  The first postcode tag on the way
is the existing postcode in OSM, second is the one identified by the script
[2] http://paste.ubuntu.com/5568754/ - In the final version I'll be
splitting these output files into sets containing 1000 ways each.
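
The split into sets of 1000 ways could look roughly like this.  It is a
sketch only: it assumes the ways are held in a list, and the function name
and the placeholder way ids are hypothetical.

```python
def chunked(ways, size=1000):
    """Yield consecutive slices of at most `size` ways."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ways), size):
        yield ways[start:start + size]

# Hypothetical way ids standing in for the real output.
way_ids = list(range(2500))
batches = list(chunked(way_ids))
batch_sizes = [len(batch) for batch in batches]
```

Each batch would then be written to its own output file, with the last
batch holding whatever is left over.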


On 21 January 2013 16:13, Aidan McGinley <[email protected]> wrote:

> @Brian - Yes, I need to work out how to QA this.  I'd like to automate the
> QA as much as possible, but having some elements done manually is obviously
> beneficial, and the more people that can cast their eye over it the better.
> Any volunteers, please do let me know, and likewise if anyone has ideas
> for how to QA this.
> @Robert - That accuracy check would be very easy to do as part of the QA
> process; I'll add it to my list of to-do items.  I'll be able to give you
> an indication of the number/percentage of postcodes potentially added after
> I do a run against the full postcode file.  Right now I just don't know, as
> I've only been working with a very small subset.  Bear in mind there are
> only 27,013 unique UK postcodes in OSM at present, so any import is going to
> be significant in my eyes.  For comparison, the number of postcodes in the
> ONS data that match the criteria I outline above is 1.7M, so even a tiny
> hit rate will result in a significant uplift to the data in OSM.
>
> On 18 January 2013 10:43, Matt Williams <[email protected]> wrote:
>
>> On 17 January 2013 23:01, Rob Nickerson <[email protected]>
>> wrote:
>> > I would imagine that this would add a fair number of postcodes, and
>> > although those interested in address lookup can just use the centroid
>> > database without needing to go to OSM, this requires knowledge of the
>> > database (which non-UK developers might not have) and does not link
>> > postcodes back to address numbers and street names. Also recall that the
>> > Auto industry asked in 2012 how OSM intends to bridge the gap between us
>> > and commercial map providers. Something like this would be a good step
>> > in the right direction in my opinion.
>> >
>> > From what I have heard, this sounds like a very cautious import and I am
>> > happy to support it. It may even have lower "error" rates than some
>> > manual edits!!
>> >
>> > RobJN
>> >
>> > p.s. Matt, if you are reading this, do you still update your graph of
>> > number of postcodes added to OSM? Might be interesting to see it.
>>
>> Sure, the latest version (from the update a few days ago) is attached.
>>
>> The vertical axis represents my interpretation of how many delivery
>> points we have with an address in the UK in OSM at the moment. This
>> means I've expanded out interpolated ways and buildings with multiple
>> addresses.
>>
>> The big straight section in the middle is from where I didn't update
>> the tool for ages.
>>
>> --
>> Matt Williams
>> http://milliams.com
>>
>> _______________________________________________
>> Talk-GB mailing list
>> [email protected]
>> http://lists.openstreetmap.org/listinfo/talk-gb
>>
>>
>