Hi, I would like to use this as an opportunity to offer a few general remarks.
It is clear that OSM is no place to import "machine learning" results without thorough checks by people familiar with the situation on the ground. I think that, at least for the moment, everyone agrees with that statement. What we're seeing, though, is the "white-washing" (for lack of a more culturally neutral term... could I say "community-washing"?) of machine learning or other low-quality data, in various ways:

* people suggest importing low-quality machine-learning data and promise a manual review and improvements during the import process, but the reality later is that the focus was on quantity, not quality, and individuals "review" tens of thousands of objects a day - this is not what most people would understand a review to be.

* people create derived works from the machine-learning data, e.g. aggregate low-quality building traces into "residential areas" and then import them, again with quality control only at the level of spot checks.

This often happens because there is no good collaboration platform for fixing errors before the data goes into OSM; the hope is that *after* the data has gone into OSM, "the community" - here used as a nebulous term that often means "other people, not us" - can and will fix things. This is a trend we should be wary of.

Looking at the GitHub issue, I do find a few "hopeful" statements there that would raise an alarm for me: "Floating roads - easy to fix" (yes, but who does it?), "we usually take care of all those mentioned fixes in our editing process" (from Drishtie - I am unsure what "usually" means and how much time Facebook has agreed to spend on this), "Yes we can go. May be some problems will be fixed later", and so on.

As always, the concrete discussion is made difficult by an existing disaster condition, where anyone who says "wait a minute" feels the pressure of standing in the way of humanitarians saving lives.
My respect to Christoph for taking a principled stand here and separating the aspects of "due process" and "humanitarian situation". I am loath to oppose the concrete project, but I think we really have to be stricter here if we do not want to become a rubbish dump in the long run. We can't call for Facebook's machine-learning output to be imported every time there's a natural disaster somewhere.

Every import should have a post-import review, where after 6 months or one year we actually analyse what has happened:

* how much has been imported (compared to what was planned)?

* have the quality checks/controls that were promised/hoped for during the planning actually materialised, or has problematic data been waved through with just spot checks?

* has the data been healthily assimilated by a local community working on OSM, or does it just sit there and rot away?

Such an "import health check" could then lead to concrete projects to improve the import if it is deemed problematic, or, in drastic cases, to a decision to remove the import again if it is found not to have been helpful.

Actually, this applies not only to imports but also to concerted mapping efforts - quote from a post that Pierre Beland made on osm-talk just yesterday: "The number of contributors is limited in Africa and the risk is that errors created by mapathons while participating to Crisis responses stay as is for years."

Bye
Frederik

-- 
Frederik Ramm  ##  eMail [email protected]  ##  N49°00'09" E008°23'33"

_______________________________________________
Imports mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/imports
