I think it might be reasonable to design an automated QA process for OSM that would systematically compare its view of the world with other mapping systems.
The algorithm would roughly be to fetch a random tile from OSM, fetch the corresponding tile from other systems, and have a machine or a human do a comparison test. Depending on what kind of quality you are looking for, I can imagine a bunch of tests, the simplest starting with "hot or not", and getting more complex from there.

If you had a budget, you could use something like Mechanical Turk to generate the comparisons, and the compare cost is probably measured in pennies per test. At a dollar per tile you should be able to make a serious dent in assessment of the tile quality. If you were clever enough about deciding which tiles to look at first, you might be able to do a 1% sample that gave you a good enough sense for how well you were doing for whatever geography you were looking at.

If you restricted yourself to comparisons within OSM and didn't look at some other external mapping program to compare quality against, it could be equally good to pick a better/worse metric for two tiles and rank order quality within OSM.

I don't think this is different from any other kind of statistical quality control, and there may be simple and computationally inexpensive tests that you can design to benchmark against "known good" tiles and "known bad" tiles.

thanks

Ed

not offering to write such code, just suggesting the architecture

On Sat, Feb 13, 2010 at 4:43 PM, Stefan Keller <[email protected]> wrote:
>> I think you have to let go of the notion of QC as it stands, basically.
>
> Don't understand this. OSM can and has to be compared to quality tests
> as every other "product".
>
>> Second, *you* are the QC.
>
> That's an interesting new and obvious approach to QC. OpenStreetBugs
> does such crowdsourced Quality Assurance (QA).
>
> But AFAIK there is no indication of coverage and completeness (which
> are important parts of QA) whatsoever in OSM. To me that's an issue!
>
> What about a webapplication where users can indicate coverage and
> completeness of OSM (say in tiles of 1 km2)?
>
> -S.
>
> 2010/2/7 SteveC <[email protected]>:
>> I think you have to let go of the notion of QC as it stands, basically.
>>
>> First, let's not pretend that traditional data suppliers are particularly
>> good quality anyway, and in fact introduce bugs on purpose in their maps to
>> trap copyright infringers. So we can aim higher than that.
>>
>> Second, *you* are the QC. I'm on a plane and can't look at your link, but
>> you can fix it. You can directly fix it in OSM, and you can email Google and
>> cross your fingers with more confidence than mailing TA/NT and just sort of
>> hoping they might fix it.
>>
>> Last, I'd say "look at wikipedia, it's fine" and worry about something more
>> important. It's like worrying about ontologies or standards... it's just a
>> time sink.
>>
>> Yours &c.
>>
>> Steve
>>
>>
>> On Feb 2, 2010, at 8:29 AM, Brian Russo wrote:
>>
>>> Sorta, yeah.
>>>
>>> I'm mostly curious if anyone else thinks data quality in "Web 2.0"
>>> foundational datasets like Google Maps matters. So I suppose yes, some sort
>>> of town hall debate over it. I haven't really seen much discussion on it.
>>>
>>> As I said, everyone has QC issues - not just Google. It just happens to be
>>> that Google has decided to go out and build their own dataset - unlike
>>> Bing, Yahoo, etc. I agree Google has done much to improve openness of data,
>>> however they also chose to make their new dataset closed. This doesn't
>>> shock me, but giving out free read access doesn't make it open data - it
>>> just means that selling data isn't important to their business model.
>>>
>>> Also, having seen how many "non-geo" people utilize maps, I find that many
>>> of them doubt themselves rather than the map - they assume they're lost,
>>> are misreading it, or the GPS has put them in the wrong spot, etc.
>>> I seldom see people decide the map is wrong - but this is just my personal,
>>> anecdotal experience.
>>>
>>> Another aspect is the impact of fragmented basemaps - different users with
>>> different devices seeing a different view of the world. Overlays are the
>>> bread and butter of mapping, and the basemap is often ignored.
>>> - bri
>>>
>>> P.S. Mike Dobson's blog - http://blog.telemapics.com/
>>>
>>>
>>>
>>> On Mon, Feb 1, 2010 at 10:00 PM, Ian White <[email protected]> wrote:
>>> not sure what you are going for--a sort of town hall debate over the merits
>>> of google's decision? it's easy to forget that parcel data was rarely even
>>> visible on a public website until several months ago. so i'll applaud
>>> google for getting title search/parcel aggregators scared to hell. no doubt
>>> their decision entire methodical and for sure had been several years in the
>>> works. i'm certain it's a matter of months (not years) before they drop
>>> major provider for business listing data and will go it their own with the
>>> small business center. it's pretty clear that consumers don't mind
>>> sacrificing quality for price. android turn by turn on VZN is case in
>>> point. price trumps quality when it comes to consumer markets. mike dobson
>>> has written some very insightful things on his blog about google's mapbase.
>>>
>>> no question people are perplexed (vexed, even?) by the seemingly
>>> unnecessary open map smackdown b/w map maker and OSM. if the geo response
>>> to haiti is any indication, we can expect google to seed more coverage a la
>>> AND to leapfrog OSM.
>>>
>>> but this is why everybody should come to where2 this march/april and attend
>>> the panel i am moderating "Base Map Smackdown" with head of TIGER, head of
>>> product for OS, our own SteveC and hopefully another participant (uh-hum,
>>> you know who you are, please respond to me offpost!)
>>>
>>> i
>>>
>>>>
>>>> Ian White :: Urban Mapping Inc
>>>> 690 Fifth Street Suite 200 :: San Francisco CA 94107
>>>> T.415.946.8170 :: F.866.385.8266 :: urbanmapping.com/blog
>>>
>>> On Feb 1, 2010, at 11:12 PM, Brian Russo wrote:
>>>
>>>> So as many of you know Google dumped TeleAtlas last October in favour of
>>>> home-grown data. Personally I found this choice over leveraging
>>>> OpenStreetMap a poor one, but that's another topic.
>>>>
>>>> Point is that since October, Google Maps' data quality has been very
>>>> spotty. From acceptable results to the truly mythic; there's just no way
>>>> to know anymore what to expect. This isn't just some academic exercise
>>>> anymore as Android hits more mobile phones and more "ordinary" people take
>>>> for granted routing & geocoding. Personally I've witnessed this firsthand
>>>> on numerous occasions. Friends that nearly missed flights due to bad
>>>> directions. Wasting half an hour lost because Google Maps (and Bing and
>>>> OSM and Yahoo) had no knowledge of an entire subdivision that's several
>>>> years old [1]. I'm sure everyone has anecdotes.
>>>>
>>>> Really I'm not trying to focus on Google Maps - other providers have this
>>>> issue, and the problem exists elsewhere (and certainly is nothing new to
>>>> geo data). However the widespread commoditization/adoption of GIS
>>>> technology and map data is a done deal and is amplifying this more than
>>>> ever before with no "man in the loop" to QC. I think unless consumers
>>>> start paying attention then this will develop into a real mess.
>>>>
>>>> What do you think? Lost cause? Will be overcome by events?
>>>>
>>>> - bri
>>>>
>>>> 1.
>>>> http://maps.google.com/?ie=UTF8&hq=&hnear=Honolulu,+Hawaii+96822&ll=21.486995,-158.061655&spn=0.007358,0.016512&t=h&z=17
>>>
>>>
>>> _______________________________________________
>>> Geowanking mailing list
>>> [email protected]
>>> http://geowanking.org/mailman/listinfo/geowanking_geowanking.org

--
Edward Vielmetti
Ann Arbor, MI 48104
Google Voice: +1 734 330 2465
Web: http://vielmetti.typepad.com
