Actually, much of the problem with the spreadsheets is that they are used 'presentationally' in many cases, with long column headings, footnotes, straddle cells, mixed data formats (dates and strings in the same column), amongst others.
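To make the "mixed data formats" complaint concrete, here is a minimal sketch of how one might flag such columns once a sheet has been exported to CSV. Everything in it is an assumption for illustration: the sample rows are made up, the names classify and mixed_columns are hypothetical, and the two date formats are simply a guess at what turns up in practice.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed date formats; real exports may well use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def classify(cell):
    """Return a rough kind for one cell: empty, date, number or text."""
    text = cell.strip()
    if not text:
        return "empty"
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return "date"
        except ValueError:
            pass
    try:
        float(text.replace(",", ""))  # tolerate thousands separators
        return "number"
    except ValueError:
        return "text"


def mixed_columns(csv_text):
    """Map each column holding more than one kind of value to its counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kinds = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            kinds[name][classify(row[name] or "")] += 1
    report = {}
    for name, counter in kinds.items():
        counter.pop("empty", None)  # blank cells are not a format clash
        if len(counter) > 1:
            report[name] = dict(counter)
    return report


# Made-up sample data, not taken from any real release.
SAMPLE = """school,inspected,pupils
Hilltop Primary,2010-03-01,212
Riverside School,not yet inspected,340
St Mary's,14/06/2009,n/a
"""

if __name__ == "__main__":
    for column, counts in mixed_columns(SAMPLE).items():
        print(column, counts)
```

On the sample, the inspected and pupils columns are reported and the school column is not. A check like this only finds the problem; it does not deal with straddle cells or footnotes, which have to be stripped before the sheet is a usable table at all.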
These presentational spreadsheets are really problematic for us database nerds, who just want to get the data in and serve it back out. Cleanup can happen afterwards. Adding an "I would like to report an error in the data" link to the bottom of a web page is a pretty trivial development task. We (and the media) are being given the data in a format that suits the civil servants, not us (or the public).

Just my opinion
Feargal Hogan

Harry says: "standardised format, on a web page"

Sam says: "but if it has changed, you need to check it"

I'm trying to do something about EU farm subsidy data, and I can vouch that government agencies don't give much thought to whoever receives the data. Worst of all, the cleanup I do is not going back into their system, since their "system" is a bunch of Excel files, so they can't do any typical database operation. (Otherwise I could just send them my SQL, and they'd get some technician to help them -- maybe even give them an "auto-cleanup" script in an Access database which they can use at the click of a button. They'd get a preview and an OK/Cancel choice.)

Back on topic: So the small problem is getting the overall structure right, and the big problem is data quality? Even if the standard dictated fields like Firstname, Surname - the data would still come in with things mixed up? And the current solution is that each media outlet needs to spend days cleaning up the data? This seems wasteful - it must be possible to get the data right in the first place?

Assuming that newspaper readers only look at the top and bottom of the lists, the article author should probably phone those schools in advance and allow them to comment on their placement. If the schools can't comment within a day, maybe the paper can publish their comments some days later. What else is this "checking" that the media have to do, which couldn't be handled more efficiently centrally?

Thanks for reading
/Simon
_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
