Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-30 Thread Maarten Deen

On 2017-03-30 09:36, Alex K wrote:


> @Jo: Wow, that’s impressive! How did the community manage to tag
> every house?!?


In the Netherlands the basic cadastre information regarding shapes and
addresses of buildings is freely available; it has been imported into
OSM and is maintained on an ad hoc basis.

https://bagviewer.kadaster.nl/lvbag/bag-viewer/index.html

Maarten

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-30 Thread Alex K
Hi all,

@Frederik: Nominatim isn’t the only reverse geocoder, and it has its problems; e.g. the missing tolerance for typos/whitespace caused me to build a completely separate reverse geocoding tool for a particular use case I was working on. So adding something just to Nominatim fixes the problem at the wrong end, because then only Nominatim would benefit from it.

@Tom: Agreed, there are some countries where this probably doesn’t make sense. I did some research on Austria and their postal code system is based on the numbers of the postal distribution hubs and in the majority of cases the postal code regions form “closest hub”-cells (travel distance, not as the crow flies). Switzerland also seems to work similarly. That’s one of the reasons why I didn’t want to use a fully automatic procedure but this “generate and edit” approach. If the shapes are weird, it shows that the approach is wrong for that country.
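
A minimal sketch of such “closest hub” cells, assuming a road network given as a weighted graph: a multi-source Dijkstra search assigns every node to the hub with the lowest travel cost rather than the smallest straight-line distance. The graph, the node names and the hub codes below are made up for illustration; this is not the actual Austrian data or the method any postal operator uses.

```python
import heapq


def closest_hub_cells(graph, hubs):
    """Assign every reachable node to the hub with the smallest travel cost.

    graph: dict mapping node -> list of (neighbour, cost) pairs
           (hypothetical road network, cost e.g. in minutes)
    hubs:  dict mapping hub node -> postal code served by that hub
    """
    settled = {}  # node -> postal code of its closest hub
    queue = [(0.0, node, code) for node, code in hubs.items()]
    heapq.heapify(queue)
    while queue:
        cost, node, code = heapq.heappop(queue)
        if node in settled:
            continue  # already claimed by a hub that is closer by travel cost
        settled[node] = code
        for neighbour, step in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + step, neighbour, code))
    return settled


# Toy network: village "c" has a fast road to hub "a" and only a slow
# road to hub "b", so it ends up in the cell of hub "a".
roads = {
    "a": [("c", 2.0)],
    "b": [("c", 9.0)],
    "c": [("a", 2.0), ("b", 9.0)],
}
cells = closest_hub_cells(roads, {"a": "6800", "b": "6900"})
```

If cells produced this way look nothing like the regions generated from plain point data, that is a hint that straight-line Voronoi cells are the wrong model for the country in question.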

@Jo: That’s what I thought at first as well. But when I did manual editing and research, I found that there are more cases than not where postal code boundaries actually do not align with admin boundaries. Sometimes it’s a question of “this small village belongs to the other postal code”, sometimes it’s that 3 admin boundaries actually cover 4 postal code regions, etc.
@Jo: Wow, that’s impressive! How did the community manage to tag every house?!? I believe you are right: if every house has proper postal code information, then all the use cases I can think of would work without having explicit postal code shapes. Unfortunately, for Germany, Austria, Switzerland and Poland, we’re far from having even the majority of houses covered… Also, as far as I understand it, postal codes were invented to be districts of postal distribution to optimise delivery, so for most countries they should form sensible regions.

@Colin: Definitely! What I currently do is use a list of “valid” postal codes for filtering out such non-addressable post boxes. I also use that list and produce a statistic so I can double-check that all post codes are represented in my data point set and figure out which postal codes would need more data points to better define the shape.
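
A rough sketch of that filter-and-count step, assuming the data points are (lat, lon, postal code) triples and the list of “valid” codes is available as a set. The function name, the threshold and the sample values are illustrative assumptions, not taken from the actual tool.

```python
from collections import Counter


def coverage_report(points, valid_codes, min_points=3):
    """Filter points by a list of valid codes and report weak coverage.

    points:      iterable of (lat, lon, postal_code) triples
    valid_codes: set of addressable postal codes (hypothetical reference list)
    min_points:  assumed threshold below which a code counts as sparse
    """
    # Drop anything whose code is not in the reference list (PO boxes etc.).
    kept = [p for p in points if p[2] in valid_codes]
    counts = Counter(code for _, _, code in kept)
    # Codes with no data point at all cannot be given a shape.
    missing = sorted(valid_codes - set(counts))
    # Codes with very few points give only a poorly defined shape.
    sparse = sorted(code for code, n in counts.items() if n < min_points)
    return kept, missing, sparse


# Made-up sample: three points for one code, one for another,
# and one point carrying a code that is not in the valid list.
sample = [
    (47.14, 9.52, "9490"),
    (47.15, 9.51, "9490"),
    (47.16, 9.50, "9490"),
    (47.10, 9.53, "9496"),
    (47.12, 9.55, "0000"),
]
kept, missing, sparse = coverage_report(sample, {"9490", "9496", "9497"})
```

Here `missing` lists the codes that need data points before any shape can be drawn, and `sparse` lists the codes whose shape would rest on very few points.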

Alex



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Colin Smale
No need to pick nits, Martin; you know what I mean. Any lat/lon
associated with a non-geographic postcode is arbitrary and volatile. It
is not needed for delivery purposes - only the code itself is used. It
can be "moved" to some other location, possibly a long distance away, in
a way that is not possible with a normal postcode. In the same way, a
mobile or service (0800 etc.) phone number might have a "lat/lon"
representing the current home address of its "user", but that is not the
same as the lat/lon of a fixed line.

On 2017-03-29 13:32, Martin Koppenhoefer wrote:

> 2017-03-29 12:14 GMT+02:00 Colin Smale :
> 
>> There are also in some countries "non-geographic postal codes" - things like 
>> reply numbers and PO Boxes.
> 
> they are geographic as well, just that their place is at the post box and not 
> at the owner of that box... ;-)
> 
> Cheers, 
> Martin


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Martin Koppenhoefer
2017-03-29 12:14 GMT+02:00 Colin Smale :

> There are also in some countries "non-geographic postal codes" - things
> like reply numbers and PO Boxes.



they are geographic as well, just that their place is at the post box and
not at the owner of that box... ;-)

Cheers,
Martin


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Colin Smale
There are also in some countries "non-geographic postal codes" - things
like reply numbers and PO Boxes. If we are able to filter these out of
the data in some way it may make the job a little easier. 

Although the UK postcodes are not defined as a boundary but as a list of
points, there are several "reverse geocoding" services in the UK which
implicitly will allow you to find the boundary that their algorithm+data
leads to. I have no idea how "accurate" their results are for a random
point, or indeed, how you would measure that accuracy.

//colin 

On 2017-03-29 10:51, Jo wrote:

> I'm in Belgium, so I'm mostly familiar with postal codes here. There are some 
> oddities, but mostly they are not too illogical and it is possible to draw 
> polygons around/in between them. 
> It seems Alex already did this exercise for Austria and Switzerland, so I 
> think it's possible there as well. He'll probably need to talk to the
> mailing lists for each country separately to figure out whether there is 
> willingness to define (initial versions of) these boundaries. 
> 
> Polyglot 
> 
> 2017-03-29 10:46 GMT+02:00 Tom Hughes :
>
>> On 29/03/17 09:41, Jo wrote:
>>
>>> For postal_code boundaries, they will very often follow existing
>>> boundaries, except where they don't... so I would say it is possible to
>>> draw them by mostly following the existing admin_boundaries.
>>
>> So now you appear to be talking about the UK which I do know about and
>> which definitely doesn't have boundaries as such.
>>
>> Royal Mail as I understand it defines each post code by a list of
>> addresses. They do also provide a centroid point derived from that list but
>> I don't believe they provide any sort of boundary.
>>
>> Tom
>>
>> --
>> Tom Hughes (t...@compton.nu)
>> http://compton.nu/



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Jo
The Netherlands are a very strange beast concerning postal codes... :-)
Definitely an example where such an approach wouldn't make sense. OTOH, you
have complete address data for every single building in The Netherlands, so
there is also no real need for it (anymore).

Polyglot

2017-03-29 11:07 GMT+02:00 Maarten Deen :

> On 2017-03-29 10:20, Tom Hughes wrote:
>
>> Rather they are just defined by a list of addresses, being a set of
>> addresses that are a convenient group to deliver to.
>>
>> Now obviously you can draw any number of shapes around those addresses
>> but none of those shapes is in any way an official or definitive
>> boundary for the postal code.
>>
>
> True, and depending on the level of detail of the postal code this will be
> a very artificial boundary in the Netherlands.
> The number part of a Dutch postal code would in most cases result in a
> proper boundary, but if you go to the number+letter part, you will include
> streets in the area that do not belong to that postal code, but will also
> not be in the shape of the proper postal code because there are no houses
> there. The street behind my house will be one such case.
>
> So, you will be able to find the proper postal code belonging to a house,
> but given a certain postal code, you could get directions to the wrong
> location.
>
> Regards,
> Maarten


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Maarten Deen

On 2017-03-29 10:20, Tom Hughes wrote:


> Rather they are just defined by a list of addresses, being a set of
> addresses that are a convenient group to deliver to.
>
> Now obviously you can draw any number of shapes around those addresses
> but none of those shapes is in any way an official or definitive
> boundary for the postal code.


True, and depending on the level of detail of the postal code this will
be a very artificial boundary in the Netherlands.
The number part of a Dutch postal code would in most cases result in a
proper boundary, but if you go to the number+letter part, you will
include streets in the area that do not belong to that postal code, but
will also not be in the shape of the proper postal code because there
are no houses there. The street behind my house will be one such case.


So, you will be able to find the proper postal code belonging to a house,
but given a certain postal code, you could get directions to the wrong
location.
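
To make the two levels of detail concrete, here is a small sketch that splits a Dutch postal code (four digits followed by two letters) into its number part and its letter part and groups addresses at both levels. The helper names and the sample addresses are hypothetical.

```python
import re
from collections import defaultdict

# Four digits, optional whitespace, two letters; tolerant of case and padding.
DUTCH_CODE = re.compile(r"^\s*(\d{4})\s*([A-Za-z]{2})\s*$")


def split_postcode(raw):
    """Return (number part, letter part) of a Dutch postal code string."""
    match = DUTCH_CODE.match(raw)
    if match is None:
        raise ValueError(f"not a Dutch postal code: {raw!r}")
    return match.group(1), match.group(2).upper()


def group_by_level(addresses):
    """Group (address, raw postcode) pairs by number part and by full code."""
    by_number = defaultdict(list)
    by_full = defaultdict(list)
    for address, raw in addresses:
        number, letters = split_postcode(raw)
        by_number[number].append(address)
        by_full[number + letters].append(address)
    return dict(by_number), dict(by_full)


# Made-up addresses: one number-level area, two number+letter groups.
sample = [
    ("Dorpsstraat 1", "1234 AB"),
    ("Dorpsstraat 3", "1234ab"),
    ("Kerkweg 2", "1234 CD"),
]
by_number, by_full = group_by_level(sample)
```

The number-level group covers all three addresses and would plausibly form an area, while each number+letter group is only a handful of addresses along a street.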


Regards,
Maarten





Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Jo
I'm in Belgium, so I'm mostly familiar with postal codes here. There are
some oddities, but mostly they are not too illogical and it is possible to
draw polygons around/in between them.
It seems Alex already did this exercise for Austria and Switzerland, so I
think it's possible there as well. He'll probably need to talk to the
mailing lists for each country separately to figure out whether there is
willingness to define (initial versions of) these boundaries.

Polyglot

2017-03-29 10:46 GMT+02:00 Tom Hughes :

> On 29/03/17 09:41, Jo wrote:
>
>> For postal_code boundaries, they will very often follow existing
>> boundaries, except where they don't... so I would say it is possible to
>> draw them by mostly following the existing admin_boundaries.
>>
>
> So now you appear to be talking about the UK which I do know about and
> which definitely doesn't have boundaries as such.
>
> Royal Mail as I understand it defines each post code by a list of
> addresses. They do also provide a centroid point derived from that list but
> I don't believe they provide any sort of boundary.
>
>
> Tom
>
> --
> Tom Hughes (t...@compton.nu)
> http://compton.nu/
>


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Tom Hughes

On 29/03/17 09:41, Jo wrote:


> For postal_code boundaries, they will very often follow existing
> boundaries, except where they don't... so I would say it is possible to
> draw them by mostly following the existing admin_boundaries.


So now you appear to be talking about the UK which I do know about and 
which definitely doesn't have boundaries as such.


Royal Mail as I understand it defines each post code by a list of 
addresses. They do also provide a centroid point derived from that list 
but I don't believe they provide any sort of boundary.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Jo
What I did, at some point when I was trying to figure out zones of bus
stops, was to create a MapCSS style which gave me different background
colours for each numbered zone. This helped me to visually find the
'outliers', based on the ones whose zone I had already figured out.

For postal_code boundaries, they will very often follow existing
boundaries, except where they don't... so I would say it is possible to
draw them by mostly following the existing admin_boundaries. If we ever
want to draw parishes, we'll probably have to 'wing it' in a comparable
way. To me this feels like how we started drawing everything all the way
back when OpenStreetMap was a blank canvas. We start with something, if
it's wrong we correct it and slowly but surely we get to a point where we
have the best data. And possibly OSM becomes the only place where those
shapes can be found. It's unlikely they are exactly what the post offices
use, but it's even more unlikely that the post offices will ever share them
in a way we can use them. And if all the houses with that post code are
within them, they are 'good enough'.
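
The 'good enough' criterion can be checked mechanically. The following is a sketch only, assuming a boundary given as a simple ring of (lon, lat) pairs without holes and houses given as small dictionaries; it uses the standard ray-casting test and all names and coordinates are illustrative.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of (lon, lat) pairs?"""
    inside = False
    count = len(ring)
    for i in range(count):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % count]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def houses_outside(houses, ring, code):
    """Return the houses tagged with `code` that lie outside the drawn ring."""
    return [
        house for house in houses
        if house["postcode"] == code
        and not point_in_ring(house["lon"], house["lat"], ring)
    ]


# Made-up square boundary and three houses; the last one is outside.
ring = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
houses = [
    {"id": 1, "lon": 2.0, "lat": 2.0, "postcode": "3000"},
    {"id": 2, "lon": 1.0, "lat": 3.0, "postcode": "3000"},
    {"id": 3, "lon": 5.0, "lat": 2.0, "postcode": "3000"},
]
outliers = houses_outside(houses, ring, "3000")
```

An empty result means the drawn shape contains every house with that code; a non-empty result points at either a boundary to move or a house tag to re-check.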

Saying that you should contribute them to Nominatim might be 'the right
thing to do', but then we lose the possibility to improve them
incrementally, which is exactly what OpenStreetMap excels in.

Polyglot

2017-03-29 10:20 GMT+02:00 Tom Hughes :

> On 29/03/17 08:58, Frederik Ramm wrote:
>
>> On 29.03.2017 09:10, Alex K wrote:
>>
>>>   * For one, this type of information is already part of
>>> OSM: http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dpostal_code
>>>
>>
>> Generally we don't like data that is impossible (or difficult e.g.
>> "knocking on doors") to verify on the ground, but we do make exceptions
>> for admin boundaries and, usually, postal code boundaries.
>>
>
> If the postal code boundary is a genuine thing that exists then sure.
>
> I don't know about the countries in question, but in the countries that I
> do know about there is no such thing as a postal code boundary because the
> authority that assigns postal codes doesn't do so using geographic areas
> like that, and doesn't specify any particular boundary.
>
> Rather they are just defined by a list of addresses, being a set of
> addresses that are a convenient group to deliver to.
>
> Now obviously you can draw any number of shapes around those addresses but
> none of those shapes is in any way an official or definitive boundary for
> the postal code.
>
> Tom
>
> --
> Tom Hughes (t...@compton.nu)
> http://compton.nu/
>


Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Tom Hughes

On 29/03/17 08:58, Frederik Ramm wrote:


> On 29.03.2017 09:10, Alex K wrote:
>
>>   * For one, this type of information is already part of
>> OSM: http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dpostal_code
>
> Generally we don't like data that is impossible (or difficult e.g.
> "knocking on doors") to verify on the ground, but we do make exceptions
> for admin boundaries and, usually, postal code boundaries.


If the postal code boundary is a genuine thing that exists then sure.

I don't know about the countries in question, but in the countries that 
I do know about there is no such thing as a postal code boundary because 
the authority that assigns postal codes doesn't do so using geographic 
areas like that, and doesn't specify any particular boundary.


Rather they are just defined by a list of addresses, being a set of 
addresses that are a convenient group to deliver to.


Now obviously you can draw any number of shapes around those addresses 
but none of those shapes is in any way an official or definitive 
boundary for the postal code.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Frederik Ramm
Hi,

On 29.03.2017 09:10, Alex K wrote:
>   * For one, this type of information is already part of
> OSM: http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dpostal_code

Generally we don't like data that is impossible (or difficult e.g.
"knocking on doors") to verify on the ground, but we do make exceptions
for admin boundaries and, usually, postal code boundaries.

But that exception would certainly not apply to derived data; if it is
desired to use derived data in geocoding, then the code to run the
derivation must be in Nominatim (so that any derived geometries
automatically update when base data is modified).

>   * The current postal code tagging of admin levels in
> Austria/Switzerland is not only of bad quality but also wrong from a
> logical aspect. The boundaries of the two have no connection in real
> life. We should get rid of THAT information because it produces
> false results in Nominatim.

You are welcome to suggest the deletion of this information on the
mailing lists/forums in Austria and Switzerland, and remove them if the
community agrees.

> So the information has high practical value and can help push OSM into
> new areas of usage. That's why I believe it is very important to add it
> for more countries...

You can add code that does sophisticated post code guessing to Nominatim
but you cannot add the result of a sophisticated guessing algorithm as a
base geometry in OSM - even less so if the algorithm you used for
guessing isn't available for inspection.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-29 Thread Alex K
Hi Tom,

I see your point. Hope you don't mind if I make a quick counter-argument and try to convince you:


  * For one, this type of information is already part of OSM: http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dpostal_code
  * According to my evaluations, the postal code relation coverage in Germany is near perfect (less than 1% error, depending on what you measure) without (as far as I know) there being publicly available information.
  * The current postal code tagging of admin levels in Austria/Switzerland is not only of bad quality but also wrong from a logical aspect. The boundaries of the two have no connection in real life. We should get rid of THAT information because it produces false results in Nominatim.
  * For most countries, there seems to be no official, publicly available data.
  * You can verify it if you a) have sets of addresses (e.g. customer databases) that are so large that incorrect entries can be filtered out as noise or b) drive along the boundary, knock on doors and ask people what their postal address is (not a serious suggestion of course, but technically not much worse than when people walk to each individual house to map the house number).
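
One possible reading of "filtered out as noise" in option a), sketched here as an assumption rather than the method actually used: keep a point only if its postal code agrees with the majority among its k nearest neighbours. The function name, the choice of k and the planar distance are all illustrative; the brute-force neighbour search is only suitable for small sets.

```python
import math
from collections import Counter


def filter_noise(points, k=3):
    """Keep points whose code matches the majority of their k nearest neighbours.

    points: list of (x, y, postal_code) triples in planar coordinates
            (assumed already projected; plain degrees distort distances).
    """
    kept = []
    for index, (x, y, code) in enumerate(points):
        others = [p for j, p in enumerate(points) if j != index]
        others.sort(key=lambda p: math.hypot(p[0] - x, p[1] - y))
        votes = Counter(p[2] for p in others[:k])
        majority, _ = votes.most_common(1)[0]
        if majority == code:
            kept.append((x, y, code))
    return kept


# Made-up data: two clean clusters plus one stray "B" in the middle of "A".
sample = [
    (0.0, 0.0, "A"), (0.0, 1.0, "A"), (1.0, 0.0, "A"), (1.0, 1.0, "A"),
    (0.5, 0.5, "B"),
    (10.0, 10.0, "B"), (10.0, 11.0, "B"), (11.0, 10.0, "B"), (11.0, 11.0, "B"),
]
cleaned = filter_noise(sample)
```

A vote like this also drops some legitimate points that sit right on a boundary, so it trades a little coverage for cleaner input.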


It's actually not "invented" but rather (imperfect) "derived" data; I think that's an important distinction. By having a high quality initial reconstruction, it makes it easy for others in the community to find errors in individual boundary lines and produce even better postal code areas. This information helped me detect/correct incorrect tagging on individual houses because of the inconsistency between postal code areas and the individual node/way. Vice versa, each new house that is tagged with a postal code will make it possible to detect inconsistencies and improve the postal code relation. Add to that the improvement to Nominatim if we had that information in OSM.

So the information has high practical value and can help push OSM into new areas of usage. That's why I believe it is very important to add it for more countries...

Alex

 



Tom Hughes wrote on 29.03.2017 00:00:


On 28/03/17 22:24, Alex K wrote:

> Basically I'm using a semi-automatic process which takes all the known
> data points (e.g. buildings/nodes with an explicit postal code tagged to
> them) in OSM, generates Voronoi cells and then merges them into larger
> regions. Then I do manual editing and clean-up in my tool (aligning
> boundaries more nicely, reducing the number of nodes, etc.) and finally want
> to import the results as new relations into OSM. I did some evaluations
> on Austria against a real-life data set and although the computed
> regions can only be an approximation, it did improve the quality of
> answers a lot! I did a basic write-up of the details plus screenshots
> here:

Invented non-real world data like this simply doesn't belong in 
OpenStreetMap I'm afraid.

I mean feel free to generate these areas and make them available for 
geocoding etc but they're not real things and they clearly can't be 
verified by anybody because they don't actually exist.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/






Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-28 Thread Frederik Ramm
Hi,

   interesting work. Maybe there's a way you can automate that and offer
it as a module for Nominatim so people who would like to use "guesswork
postcodes" as a better-than-nothing alternative could activate that in
Nominatim.

Similar things have been done before e.g. for the UK
https://github.com/chfw/pc2shape and
http://random.dev.openstreetmap.org/postcodes/.

Importing your "guesswork postcodes" to OSM is not possible I'm afraid.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-28 Thread Simon Poole
Speaking on the Swiss situation, there is an open access postal area
dataset available on a wonky licence; depending on what you want to use
it for, it may be OK. See
https://opendata.swiss/en/dataset/amtliches-ortschaftenverzeichnis-mit-postleitzahl-und-perimeter1

Originally at least it was generated by examining the postal delivery
routes; nowadays it lives a senseless and expensive life of its own.
Note that it is updated on a monthly basis and tends to have a large
number of changes every time.

As to address data in OSM: there is no real agreement on whether we are
tagging real postal addresses including postal cities or not. Postal cities
tend to be difficult to survey and not really accessible for non-locals, so
we tend to end up with a mix of admin entities and postal whatevers in addr:city.

Simon





Re: [OSM-dev] Programatic reconstruction of postal code areas

2017-03-28 Thread Tom Hughes

On 28/03/17 22:24, Alex K wrote:


> Basically I'm using a semi-automatic process which takes all the known
> data points (e.g. buildings/nodes with an explicit postal code tagged to
> them) in OSM, generates Voronoi cells and then merges them into larger
> regions. Then I do manual editing and clean-up in my tool (aligning
> boundaries more nicely, reducing the number of nodes, etc.) and finally want
> to import the results as new relations into OSM. I did some evaluations
> on Austria against a real-life data set and although the computed
> regions can only be an approximation, it did improve the quality of
> answers a lot! I did a basic write-up of the details plus screenshots
> here:


Invented non-real world data like this simply doesn't belong in 
OpenStreetMap I'm afraid.


I mean feel free to generate these areas and make them available for 
geocoding etc but they're not real things and they clearly can't be 
verified by anybody because they don't actually exist.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



[OSM-dev] Programatic reconstruction of postal code areas

2017-03-28 Thread Alex K
Hi all,

I'm working on programmatically generating postal code areas to improve reverse geocoding quality for Switzerland, Austria, Italy and some other European countries where there are no or only insufficient postal code relations. Austria and Switzerland sometimes have postal code tags on their admin level 8 boundaries, but this is incorrect (in most cases postal code area != district area) and especially in large cities there is no information (because this would require multiple postal codes to be mapped to one admin boundary).

Basically I'm using a semi-automatic process which takes all the known data points (e.g. buildings/nodes with an explicit postal code tagged to them) in OSM, generates Voronoi cells and then merges them into larger regions. Then I do manual editing and clean-up in my tool (aligning boundaries more nicely, reducing the number of nodes, etc.) and finally want to import the results as new relations into OSM. I did some evaluations on Austria against a real-life data set and although the computed regions can only be an approximation, it did improve the quality of answers a lot! I did a basic write-up of the details plus screenshots here:

https://metashapes.com/blog/reconstruction-postal-code-areas-using-openstreetmap/
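
For readers who want a feel for the "Voronoi cells, then merge" step without the tool itself, here is a rasterised approximation under stated assumptions: every grid cell takes the postal code of its nearest data point, so cells sharing a code form the merged region. This is not the author's implementation (which is not public); the grid layout, the function name and the sample points are made up.

```python
import math
from collections import Counter


def rasterised_regions(points, cols, rows, cell=1.0):
    """Label each grid cell with the postal code of the nearest data point.

    points: list of (x, y, postal_code) triples in planar coordinates
    cols, rows, cell: assumed grid layout, starting at the origin (0, 0)
    Returns a dict mapping (column, row) -> postal code.
    """
    grid = {}
    for col in range(cols):
        for row in range(rows):
            # Use the centre of the grid cell as its representative location.
            cx = (col + 0.5) * cell
            cy = (row + 0.5) * cell
            nearest = min(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
            grid[(col, row)] = nearest[2]
    return grid


# Two points share a code, so their Voronoi cells merge into one region.
sample = [(1.0, 1.0, "9490"), (2.0, 1.0, "9490"), (8.0, 1.0, "9496")]
grid = rasterised_regions(sample, cols=10, rows=2)
area_per_code = Counter(grid.values())
```

A vector version would compute true Voronoi polygons and dissolve neighbours with equal codes; the raster form trades precision for simplicity and still shows where the boundary between two codes falls.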

Looking great so far, right now I'm developing/testing my tool on Liechtenstein because it's a smaller test set. There are two things where I hope the community might help:

- Is there any further data source that has points of interest or something similar with postal code information? Using street/postal-code pairs contained in OSM seems to produce a lot of false information, so I'm looking for longitude/latitude/postal-code triples in CH, AT, NL, PL, IT. Right now, for licensing reasons I'm only using information that's already inside OSM and additional data points from OpenGeoDB. Perhaps a list of restaurants or public places could be used to query OSM for their geo-coordinate... I'm mostly interested in CH/AT.
- What's the best way to get the information back into OSM? I guess I could either use a C++ library to import it directly from my tool, or export my data to some shapefile format and then use some other means to put it into OSM (e.g. through JOSM or some bot). I'm also more than happy to simply contribute the shape information if someone else has more experience with importing such information into OSM! I unfortunately won't be able to share the code of the tool itself...
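
As a sketch of how the longitude/latitude/postal-code triples already inside OSM could be collected, the following builds an Overpass QL query for objects tagged with `addr:postcode` and turns an Overpass JSON response into triples. It assumes the country area can be selected by its `ISO3166-1` tag; the sample response is hand-written for illustration, and sending the query to an Overpass server is deliberately left out.

```python
import json


def build_overpass_query(iso_country):
    """Overpass QL for nodes and ways with addr:postcode inside one country."""
    return (
        "[out:json][timeout:60];\n"
        f'area["ISO3166-1"="{iso_country}"][admin_level=2]->.country;\n'
        "(\n"
        '  node["addr:postcode"](area.country);\n'
        '  way["addr:postcode"](area.country);\n'
        ");\n"
        "out center;\n"
    )


def triples_from_response(payload):
    """Extract (lat, lon, postal_code) triples from a parsed Overpass response."""
    triples = []
    for element in payload.get("elements", []):
        code = element.get("tags", {}).get("addr:postcode")
        # Nodes carry lat/lon directly; ways carry them in "center".
        position = element if element.get("type") == "node" else element.get("center")
        if code and position and "lat" in position and "lon" in position:
            triples.append((position["lat"], position["lon"], code))
    return triples


query = build_overpass_query("LI")

# Hand-written sample in the shape of an Overpass JSON response.
sample_response = json.loads("""
{"elements": [
  {"type": "node", "id": 1, "lat": 47.141, "lon": 9.521,
   "tags": {"addr:postcode": "9490"}},
  {"type": "way", "id": 2, "center": {"lat": 47.066, "lon": 9.503},
   "tags": {"addr:postcode": "9496", "building": "yes"}},
  {"type": "node", "id": 3, "lat": 47.2, "lon": 9.5, "tags": {"amenity": "cafe"}}
]}
""")
triples = triples_from_response(sample_response)
```

Triples gathered this way stay within OSM's own licence, which matters for the first question above; data from other sources would need a separate licence check.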

Any input and/or help is highly welcomed!

Alex


