[talk-au] Import vs filtering query

Little Maps Sat, 04 Sep 2021 03:54:51 -0700

Hi all, my understanding is that the process described below is a big filtering 
exercise rather than a data import, but since I’ve never been involved in an 
import before, I’d like to check before progressing. Thanks in advance for your 
feedback.


Goal: to update road surface tags across regional Victoria where necessary. 
Many surface tags were added 8-10 years ago and a surprising number of roads 
have been surfaced since then. (I’m only interested in sealed/paved vs 
unsealed/unpaved options, not subsets of these.)

Method: compare road surface data in OSM against data in the Vic government’s 
transport dataset, which we have permission and a waiver to use. All rural roads 
from motorways to unclassified (not residential, service, etc) that have 
different tags in OSM and the gov dataset will be examined against satellite 
imagery and Mapillary, and any decisions on whether to update the surface tags 
will be made based solely on the imagery. No data will be directly copied from 
the gov dataset. Hence, as I understand osm’s import guidelines, this is a big 
filtering exercise rather than an import. Is that a correct interpretation? 
I’ve added a longer explanation below to help answer any questions.

Basic assumptions: (1) I assume both datasets were made independently, as I’ve 
not seen any evidence that OSM surface tags were copied from the Vic data (or 
that the gov copied from OSM). (2) If the 2 independent datasets both indicate 
the same surface then I assume it is most likely to be correct. If they 
indicate different surfaces then one must be in error. At the outset, I have no 
idea how accurate the Vic gov dataset is, so I’m not assuming it is infallible 
(it’s definitely not; see comment below). 

Methods: for every road segment that has a different surface tag in the 2 
datasets, I’d inspect the road using available imagery, as is normally done 
when adding or updating a surface tag. Existing OSM tags will either be altered 
or retained, as required. There’s no ambiguity involved in updating a tag from 
unpaved to paved. It’s much less common to need to update a tag from paved to 
unpaved. Again, this will be done based on imagery, regardless of what the Vic 
data says. 
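As a rough sketch of the filtering step described above: the surface value in each dataset can be collapsed to paved or unpaved, and a way is flagged for imagery review only when the two classes disagree. The value lists and the function names `surface_class` and `needs_review` are assumptions made for illustration. In particular, the gov dataset is assumed to use values like "sealed" and "unsealed", which may not match its real schema. Nothing here copies a gov value into OSM; the output is only a yes/no flag.

```python
# Illustrative value lists (assumptions, not exhaustive).
PAVED = {"paved", "asphalt", "concrete", "chipseal", "sealed"}
UNPAVED = {"unpaved", "unsealed", "gravel", "dirt", "compacted",
           "fine_gravel", "ground", "earth", "sand"}


def surface_class(value):
    """Collapse a surface value to 'paved', 'unpaved', or None if unknown."""
    if value is None:
        return None
    v = value.strip().lower()
    if v in PAVED:
        return "paved"
    if v in UNPAVED:
        return "unpaved"
    return None


def needs_review(osm_surface, gov_surface):
    """True only when both datasets have a known class and the classes differ."""
    osm_class = surface_class(osm_surface)
    gov_class = surface_class(gov_surface)
    return osm_class is not None and gov_class is not None and osm_class != gov_class


# The flag marks a way for inspection against imagery; it never sets a tag.
print(needs_review("gravel", "sealed"))
print(needs_review("asphalt", "Sealed"))
```

Ways with a missing or unrecognised value in either dataset are not flagged in this sketch, since there is no discrepancy to check.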

Some prelim observations: I’ve trialled the method in NW Vic, where it works 
fine on longer road segments/ways. The approach would have to be restricted to 
ways longer than 1-2 km; shorter ways will be ignored. From an 
initial subset of about 50 roads > 5 km long in NW Vic, I found about 2/3 of 
the discrepancies between the 2 datasets did not warrant any change in OSM and 
about 1/3 did. The Vic gov data doesn’t seem to be as up-to-date as the imagery 
and isn’t by any means perfect. Regardless, the approach looks to be a very 
effective way to find out-of-date and inaccurate road surface data across the 
state.
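A minimal sketch of the length cut-off and of tallying review outcomes, assuming each flagged way is held as a small record with a length in km and the outcome of the imagery check. The field names, the 2 km default and the sample records are made-up placeholders, not real data or results.

```python
from collections import Counter


def filter_by_length(candidates, min_km=2.0):
    """Keep only ways strictly longer than min_km (the threshold is an assumption)."""
    return [c for c in candidates if c["length_km"] > min_km]


def tally_outcomes(reviewed):
    """Count how many reviewed ways ended in each outcome."""
    return Counter(r["outcome"] for r in reviewed)


# Placeholder records for illustration only.
sample = [
    {"way_id": 1, "length_km": 0.4, "outcome": None},
    {"way_id": 2, "length_km": 6.2, "outcome": "osm_kept"},
    {"way_id": 3, "length_km": 8.9, "outcome": "osm_updated"},
]

to_review = filter_by_length(sample)
print([c["way_id"] for c in to_review])
print(tally_outcomes(to_review))
```

A tally like this would also feed the summary of which discrepancies were most common.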

At this stage I don’t know how many ways will be examined or changed, as it 
will depend on the minimum length of road I inspect. I’m envisaging about 1000 
at most, and probably fewer.

My guess is that, if the process was completed across Vic, then the surface 
data in OSM would be extremely accurate, and more accurate than in the Vic gov 
database. If I get through enough of it without going bonkers, I’m interested 
in summarising the findings to show which discrepancies were most common, etc.

So, back to the original question, is this process ok to pursue, given that the 
sole function of the gov dataset is to provide a filtering mechanism to 
identify roads to investigate, and all decisions will be made based on legally 
available imagery, not the gov data?

Thanks very much for your feedback, Ian

_______________________________________________
Talk-au mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/talk-au
