Hi everyone,
I've had a bit of time today to gather my thoughts on this import and I hope I can offer something more productive to the discussion now. First, I want to apologize to the importers for the panicked tone of my initial email and private communications. I saw after a long day that the buildings were literally one task away from completely swamping my own neighborhood, and I hope it's understandable that I felt pretty defensive about it, having put so much time into my own little corner of the city over the years. So, I want to thank you all for taking that in stride, and especially for agreeing to stop the import while we discuss the issues I raised. If I came off as harsh or unappreciative, please be sure that I didn't mean to. We're all volunteers here and I know first-hand how much work goes into doing something like this. I'm actually one of the lead mappers for a building import in my hometown at the moment - I'm not opposed in any way to imports of buildings if they're done right.

But I've also spent way too much time cleaning up bad import data - whether it's TIGER imports from way back when or more recently the disturbingly sloppy address ranges that showed up last year in Toronto. In my experience, it takes so much less time to get this right in the first pass than it does to clean up the damage months or years later when we realize some mistakes were made or the data could have been handled better.

There have been a lot of responses to some of the specific things I said, so instead of replying inline, let me try to rephrase the big issues as I see them with some of the new perspective and information in mind.

A ) This import, essentially, did not get approval from the imports list. While an email was sent, I think that it was so vague and misdirected (surely with no nefarious intent) that it would be hard or impossible for a casual subscriber to the list to understand the scope of the project. Without having understood the scope of the project, which is utterly huge, the import plan was not given adequate scrutiny. This is evidenced by the relative lack of discussion.

B ) I didn't know this was going on until I saw it happening. While my personal knowledge is obviously not a necessary precondition for successful imports, I do feel it may be a sign that the scale of this effort is wrong for the task at hand. While the technical details and any processing of the data are probably best handled at the national level, since it all comes from the same source and presumably has the same technical hurdles to overcome, I can't imagine that the whole country can be asked whether it wants buildings to be imported or not, or what concerns and requirements would come attached to such an import. There will be so much local variation and I think that just has to happen at a more local level. If that local effort had been made, I'd be surprised if I never heard about it. Rather than attempt to notify all Canadian mappers, would it be too much to ask that this might go province by province or city by city? If I had seen 'Toronto' or 'Ontario' anywhere on this mailing list, you can be sure my ears would have pricked up right quick.

C ) This import is going way too fast - there is simply no way three people could have carefully imported as much data as has been imported in the time since this started. Like I said, I'm working on an import myself and it's long, tedious, and strangely satisfying work when you're doing it carefully. In my opinion, these task squares are simply ten times too large at least. When I said above that my neighborhood would be swamped by the next task, I really mean swamped. 90% of the places I go in Toronto fit inside a single task. The tasking manager we're using for the building import in Hamilton County allows one to upload custom task geometries. I got a bit silly with the task shapes perhaps (https://tasks.openstreetmap.us/project/107) but I think the size is about right - importing 500-1000 building footprints should take ~10-30 minutes, with a careful check of the imagery, a check with JOSM's validation tool, a second validation after native OSM data has been merged with the import data... I would never attempt a task as large as the smallest task here, and I do not think that reflects poorly on my abilities or experience. If the tasking manager doesn't allow smaller tasks then it is the wrong tool for the job.

I have several specific technical issues with / questions about the data that are probably best addressed in some other forum, like on the wiki. If I may, I'd like to save those for the moment, because I think I see a productive way to keep moving forward with things while we discuss.

The data needs to be carefully and thoroughly validated at some point, right? May I suggest that everyone stop importing new data and engage themselves in cleaning and validating the data that has already been brought in, neighborhood by neighborhood? There is plenty to keep us all busy for weeks. While doing that, let's make a list of issues that we come across and discuss ways that they can be addressed before any new buildings are brought in. We can take this as a learning experience and make the rest of this import process better.

I have the feeling that some will feel this is redundant - wasn't the Ottawa import the test run? My response has to be that the data and the process are not yet as good as they can and should be, so another round of trials and iterative improvement is needed before this rolls out a mari usque ad mare.

With all due respect, patience, and humility,

Nate Wessel
Jack of all trades, Master of Geography, PhD candidate in Urban Planning
NateWessel.com <http://natewessel.com>

On 1/17/19 3:13 PM, OSM Volunteer stevea wrote:
Thank you, John.

On Jan 17, 2019, at 11:22 AM, john whelan <[email protected]> wrote:
First if you look at the 2020 wiki page history you'll see there is a lot of 
input from Steve.  My concern with this very detailed input is it made it hard 
for a new person to quickly locate relevant information, an overview if you 
like.
I encourage an "Overview" section or what some call a "Quick Start."  For some 
(experienced OSM mappers), this could suffice for "jumping in right now."  However, there is no 
shortcut for anybody involved in the importation of these data to read every single word of the wiki.  If 
wiki words aren't relevant, they either weren't in the right wiki or they could have and should have been 
deleted.  As I wasn't sure of the actual direction of the project, I added what I thought would help.  I 
would much rather have there be more (extraneous, even) guidance and instruction which later got deleted as 
superfluous than not enough and leave volunteers with more questions than answers.  Call this a failure to 
edit the wiki properly, though not on my part.

I will confess that there have been small groups in face to face meetings in 
small cafes where you need a password to logon to the internet.  He was not 
specifically invited to them all.

I confess we have used conference calls and other methods of communication 
without notifying hundreds of people first.  There have even been meetings that 
I was unaware of.  For example I haven't even communicated directly with the 
mappers who are doing most of the import at the moment.

There has even been at least one mapathon that Stats Canada only found out 
about after the event.
I believe what is being said or conveyed here is that decentralized discussion preceding data input "happens."  Sure, 
it does, that is part of a planning process and not all of these are "widely open to all of OSM," nor should they be, 
nor must they be.  So, largely, "we agree" though I'm puzzled at your use of the verb "confess."  Largely 
speaking, it is the degree to which openness happens in OSM (or the spirit of moving it in that direction, especially when 
identified as "we need more here") which is important, not specific cases where openness didn't happen.

Personally I'm not convinced that OpenStreetMap really needs every building in 
the planet mapped in detail.
I don't wish to change your mind, but as you point out later, others seem to 
disagree with you, seeing the urgency with which these data enter OSM.

The history was I was after the bus stops in Ottawa which meant I needed them 
with an open data license we could use.  I used to work at Stats Canada and the 
corporate culture is very different to OSM.
Understandable and nothing wrong with that, especially as OSM does not seek to 
house our data with Stats Canada.  However, the reverse...we know the story.

In Canada we have fewer mappers on the ground and more places to map than in 
many parts of Europe.  We have a history of importing CANVEC data which comes 
from a number of sources including Municipalities.  So I acted in a 
coordinating role.  We managed to persuade the City of Ottawa to change its 
open data license to align with the federal one.  I got my bus stops.  The 
local mappers were very much involved and there were at least half a dozen face 
to face meetings that took place.  I drifted down to one of them.

Stats was very pleased with the added tags on the building outlines in Ottawa. 
This is information they felt could not be easily obtained in any other way.
Informative and appreciated.  There are "pockets of uniqueness" all over the world and hence 
methodologies of "this is a good match here" for data entering OSM which will and do widely differ 
around the world.  However, I believe all can agree that "quality data are quality data" (as well 
as the opposite) and for this fundamental reason, OSM has standards to follow.

I am very aware that this data is important to many.  This includes Federal 
government departments and agencies.  They were very vocal at a meeting at 
Stats Canada during the HOT summit in Ottawa.  It was open and at least half a 
dozen OpenStreetMappers were present, three or four were from European or other 
out of town locations.  Having the building data in one place makes it much 
easier for the end users than having to handle different formats and open data 
licenses.  Currently one municipal social agency is very interested in mapping 
places where fresh food can be obtained.  I forget some of the other interests 
but they were quite legitimate.  We have seen considerable interest by high 
schools and students in OpenStreetMap and using streetcomplete with building 
outlines is one way that they can add value without causing too much havoc.
These are precisely the sort of reasons why OSM (with high quality, usable, local data) is so 
important.  Nobody disagrees with "high value data provide high value solutions" as an 
equation that many use.  The "front end" of that, how the data enter, is obviously key 
here.

After we imported Ottawa a group of mappers decided that we needed more 
buildings.  They organised mapathons with new mappers and mapped buildings with 
iD.  The results were not good and the data quality side was raised in talk-ca.
I was involved in one where I set up new mappers with JOSM and the 
buildings_tool plugin and that went much better as far as accuracy was 
concerned.
Indeed, this is a typical "use case" in OSM:  a feedback loop says "not good results," so 
improvements to process hopefully assure the next iteration yield better data/results.  Congratulations on 
those successes, they are more of the good stuff of which OSM is made.  "The journey is the reward" 
is part of what's important in the process.  Although, good data as a result is important, too.

The result of these mapathons and the community reaction was to convince Stats 
Canada that releasing more building outlines as was done in Ottawa under an 
Open Data license was a way forward.  Kingston in particular was keen to 
release its building outlines and get them into OpenStreetMap.  Obtaining them 
and making them available was a Stats Canada decision and was made in their 
time frame.
But, was it made within OSM's OWN tenets and timeframes?  That's a crucial consideration 
I continue to feel receives short shrift (as you seem in the mood to "confess").

Given that Stats Canada released the data under an acceptable Open Data license 
I thought and still think the best way forward was to set up a plan and a 
process to import the data.  The alternative was probably going to be Ad-Hoc 
importing.
I, too, think (and OSM knows) that the best way forward (with importable data) is to set 
up a plan with process.  I thought we did so with the BC2020 "reboot."  Yet, it 
isn't working, or is only partially working with limited success (I'll look at that 
portion in the glass that partially fills it rather than calling it empty when it isn't). 
 So, yet again, let's do a mid-course (or perhaps early-course) correction and right the 
ship.  Really, we seem to largely agree!

I suspect that talk-ca is probably the most appropriate mailing list for this 
sort of discussion which is why I emailed Nate directly.
We can move this to talk-ca if you like, I'm OK with that.

Thanks for continuing good dialog,
SteveA

_______________________________________________
Imports mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/imports
