On 2015-08-18 02:13, Warin wrote:
> On 17/08/2015 11:13 PM, Colin Smale wrote:
>
>> ...which IMHO is part of the bigger picture of data quality. Quality is not
>> the same as perfection. It is about agreeing things, complying with what has
>> been agreed, the ability to measure the compliance objectively and feedback
>> to help improve the compliance.
>
> ISO 9000 is a standard for quality .. it means if you produce something ..
> you will continue to produce that something consistently .. rubbish or not.

Actually it is a standard for Quality Management Systems. It does not tell you what attributes your product should have - that's between you and your consumer/customer. OSM doesn't really have any way of assessing its product against desired attributes. How do you think OSM's product should be measured for these purposes? At the moment it is very subjective - "good" is anything which is not considered "bad", and "bad" means shouted down by a few people on a mailing list and/or vetoed by the DWG in a sort of Star Chamber process.

> 'Agreed'? Buy whom? OSM can have new tags introduced by anyone. The reality
> of this is that tags that get used frequently by a number of mappers get
> 'recognised'.

Agreed between producer and consumer. Our definition of quality will not include a limitation to ONLY use certain tags, implying that it is the consumer's responsibility to ignore arbitrary tags. What are our consumers' expectations? What (apart from product price) will drive their decision to use OSM instead of other sources?

> Tags that get 'approved' by the tagging group get the status=approved thing,
> those rejected get the status=rejected .. but even the rejected tags get
> used, some even advocate their use.
> One can take the attitude that at least these tags have been review by some,
> compared to tags that are simply added by one person without review.
>
> Compliance .. with what? The wiki documented tags? Those can be added by
> anyone. As there is no scheme/philosophy for OSM .. then you have nothing to
> comply to that cannot be changed so easily that it is not worth the effort.

Compliance with the agreed "specifications." Once again, we don't have a good definition of "quality" for OSM data, so we cannot use that to judge whether data is "good" or "bad", or, put another way, "compliant" or "non-compliant".

So what dimensions could we apply to OSM data to assess its quality? I am just throwing some ideas in the mix here, this is not my "answer". In all cases please imagine the words "to what extent" at the start of the sentence.

* Completeness
  Is the data complete, given its intended scope? For example, do we have ALL the train stations in the UK?

* Correctness
  Are there any typos in the tagging? Is a train station not tagged as a tram stop? Is the use of documented tags in line with the documentation?

* Consistency
  Is the tagging consistent, across its intended applicable domain? (I intend to suggest that it is probably impossible to get tagging consistent across the whole world, but within a country for example it should most definitely be achievable)

* Timeliness
  Is the data still valid today? Or to make it "SMART", how long ago was the data reviewed? Different things will need different standards here - some things are obviously more volatile than others.

* Verifiability
  Did the data come from a suitably licenced source? Is the data verifiable by an independent member of the public without any legal privilege?

* Consumability
  Is the data represented and made available in a way which facilitates its use? For example, dates in arbitrary local formats would not be compliant here. We might not be too happy with tags using non-Latin characters. The use of XML is good, but it's a shame we don't have even a basic XSD yet (I am working on this though)

All this might tell us how the data scores, but it doesn't tell us what we should consider "good enough". In some cases we can expect to get close to 100% (e.g. train stations in the UK), but all sorts of factors will keep the score below 100% in practice (like when a new station opens, it MAY take a long time to find its way into OSM. In the meantime we are down to 99.9%). In other cases, we might be ecstatic if 15% of the data was entered/reviewed in the last 5 years.

--colin
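
To make the scoring idea concrete (what share of a known reference set is present, and what share of the data was touched within the last N years), here is a rough Python sketch. Everything in it is an assumption for illustration only: the function names, the made-up sample data, and in particular the use of an element's last-edit timestamp as a stand-in for "reviewed", which OSM does not record as such.

```python
from datetime import datetime, timedelta, timezone


def completeness(mapped_names, reference_names):
    """Share of reference features that also appear in the mapped data."""
    reference = set(reference_names)
    if not reference:
        return 1.0  # nothing was expected, so nothing is missing
    return len(reference & set(mapped_names)) / len(reference)


def timeliness(last_edited, now, max_age_years=5):
    """Share of elements whose last edit falls within max_age_years of now."""
    if not last_edited:
        return 0.0
    cutoff = now - timedelta(days=365 * max_age_years)
    recent = sum(1 for stamp in last_edited if stamp >= cutoff)
    return recent / len(last_edited)


# Made-up sample data, not real OSM content.
reference_stations = ["A", "B", "C", "D"]   # hypothetical authoritative list
mapped_stations = ["A", "B", "D", "X"]      # hypothetical extract from OSM
now = datetime(2015, 8, 18, tzinfo=timezone.utc)
edit_stamps = [
    datetime(2014, 1, 1, tzinfo=timezone.utc),
    datetime(2012, 6, 1, tzinfo=timezone.utc),
    datetime(2009, 3, 1, tzinfo=timezone.utc),
    datetime(2008, 9, 1, tzinfo=timezone.utc),
]

completeness_score = completeness(mapped_stations, reference_stations)
timeliness_score = timeliness(edit_stamps, now)
print(f"completeness: {completeness_score:.0%}")
print(f"timeliness:   {timeliness_score:.0%}")
```

Scores like these only say how the data measures up; the "good enough" threshold for each dimension still has to be agreed separately.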
_______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk