The result is that folks like myself and others are frustrated by the
  import process, and folks who have good, useful datasets are frustrated
  by the import process.

  [import/mechanical-edit committee proposal]

I agree with your broad sentiments.

Having observed some recent discussion, I think we have two fundamental
problems:

  1) the import guidelines don't adequately describe what is actually
  expected (reasonably so) by the more experienced people

  2) people who want to import are very enthusiastic and often do not
  fully appreciate the difficulty of doing it right and the benefits of
  review and care, and each new would-be importer needs to have the
  norms communicated to them

I have a concern that while there is wide agreement that imports must be
careful, there is also a view (which I perceive to be a minority view)
that all imports are harmful.  For the committee and "import with care"
effort to be socially successful, I think it has to be separated from
the "do not import at all" view.  I think your note expresses that
separation (or rather, only expresses the view that imports must be done
with care, and I am speculating that you did that on purpose), but I
wanted to mention this explicitly.

I realize my proposals below may come across as strict, but I am
actually in favor of careful imports of high-quality data, when done by
people with a sense of stewardship for the affected area.  (I'm in
Massachusetts, and most of the MassGIS data is very high quality, so
that's my implicit reference point.)  So I am not trying to stop
imports; rather, I think that with more care and especially more delays
for review, we'll get a better outcome in terms of the ratio of map
utility to total volunteer time.

My thinking is heavily influenced by the experience of leading a
~20-person software team, with a loose analogy of preparing changes on
branches and then merging to master with approval.  I know imported data
isn't software, but in terms of preparing bits and then changing the
shared code/data base, I think it's quite analogous.

Overall I suggest three concrete steps:

  1) document the actual expectations on the wiki.  Specifically

     a) The conversion process has to be described well enough to be
     considered High Level Design from a software viewpoint so that
     someone else could write the conversion scripts.  This should
     address datum/projection issues.  Most importantly, it should
     address how the import avoids new data that conflicts with old
     data.  The plan should describe which tools will be used to put the
     data in the main database.

     b) The actual data to be uploaded (with all pre-upload cleanup
     actually done, not the notion that each file will get manual
     cleanup before uploading) has to be posted for review.

     c) No data can be uploaded until the per-import page has met the
     standards, the scripts and converted data that will be uploaded
     have been published, and there's been a 14-day review period, which
     is reset by any substantive change in the page or any change in the
     script or data.

     d) (probably) the data should be uploaded to some test server
     (assuming there is one) so that people can see what happens in the
     database and with rendering.  Each person doing uploads should be
     expected to do the test server upload.

     e) Once the two weeks have passed, and there is rough consensus
     that the plan and data are adequate, a small amount of data (but
     bigger than can be examined 100% by hand) can be uploaded.  The
     idea is to have something that is not that big in case there is
     trouble, but for which the process will be representative of the
     rest.  An example would be a single town in Massachusetts, with
     thousands of buildings or address points or hundreds of roads.

     f) After the initial small upload, there is another 14-day review
     period, during which people can find issues with the data.  If
     there are significant issues, the proposal, script and data should
     be fixed, and the 14-day review period in step c starts anew.

  2) Add the notion that when people talk about imports, the committee
     contacts them privately and makes sure they really understand point
     1.  Probably also a public note in response, briefer.  Someone from
     the committee should stay in touch about judging when the consensus
     in (e) has happened.  Overall, aside from documenting the norms, I
     see this as the main job of the committee.

  3) For areas where it makes sense, consider sending private messages
     via the web site to registered active mappers in the area.  For
     example, once the MassGIS buildings import had entered the 14-day
     review period (with all concerns met), it might make sense to
     message every Mass mapper who has edited in the last 90 days and
     point out the wiki page and that it's being discussed on
     talk-us@.
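
To make the timing rules in steps (c) and (f) and the 90-day filter in
point 3 concrete, here is a minimal sketch in Python.  The names
(ImportProposal, record_change, recently_active) are hypothetical and
only illustrate the rules as I've described them; "rough consensus" is
a human judgment and is deliberately not modeled.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_DAYS = 14   # review period from steps (c) and (f)
ACTIVE_DAYS = 90   # "edited in the last 90 days" from point 3


@dataclass
class ImportProposal:
    # Date of the most recent substantive page change, or any
    # script/data change; the review clock runs from here.
    review_started: date

    def record_change(self, when: date) -> None:
        """A substantive page change or any script/data change resets the clock."""
        self.review_started = when

    def review_complete(self, today: date) -> bool:
        """True once a full review period has passed with no resets."""
        return today - self.review_started >= timedelta(days=REVIEW_DAYS)


def recently_active(last_edit_by_user: dict, today: date,
                    days: int = ACTIVE_DAYS) -> list:
    """Return users whose last edit falls within the past `days` days."""
    cutoff = today - timedelta(days=days)
    return sorted(u for u, d in last_edit_by_user.items() if d >= cutoff)
```

The point of the reset is that reviewers always get the full period to
look at exactly what will be uploaded, not at an earlier draft.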


(Everyone knows that this discussion was triggered by the MassGIS
buildings import.)  I should emphasize that I'm not trying to pick on
Jason here.  I think data was uploaded too soon, and for too many towns.
But, I have looked at the map of data that's been imported, and driven
around today with the data on my Nuvi (last night's us-northeast
geofabrik extract - thanks again to Frederik for providing those) plus
the not-yet-imported data for my town and all the towns between it and
Cambridge.  Aside from one glitch in not-yet-imported data (in one
town), everything looked excellent in terms of accuracy.  I would see a
small building on the map, find it odd, and then look at the real world
and in fact it was there, every time.  There were a few houses that are
not in the data (probably due to tree cover).  The only building on the
map that isn't there was one in my town that was torn down one year ago.
I could not tell when I crossed from new data to data from the previous
lidar import.  I did not see a single overlapping/messy building.  So
while there are (entirely fair) process concerns about long enough
review periods (which are NOT documented on the import guidelines
page!), the actual uploaded data looks good to me.  I'm unaware of
anyone pointing out a specific significant issue (or really any issue)
with the uploaded data.

Greg
