Actually, much of the problem with the spreadsheets is that in many cases
they are used 'presentationally', with long column headings, footnotes,
straddle cells and mixed data formats (dates and strings in the same column),
amongst other things.
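
To make the mixed-format point concrete, here is a rough Python sketch of
sorting one such column into dates, free text and blanks. The list of date
layouts and the sample cells are made up for illustration; a real spreadsheet
would almost certainly need more layouts than these three.

```python
from datetime import datetime

# Hypothetical date layouts; extend this for whatever the real files contain.
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d", "%d %B %Y")

def normalise_cell(value):
    """Return (kind, cleaned), where kind is 'date', 'text' or 'empty'."""
    text = str(value).strip()
    if not text:
        return ("empty", None)
    for fmt in DATE_FORMATS:
        try:
            # Dates are rewritten as ISO 8601 strings so they sort and load cleanly.
            return ("date", datetime.strptime(text, fmt).date().isoformat())
        except ValueError:
            continue
    # Anything that matches no known layout is kept as plain text.
    return ("text", text)

# Invented sample column mixing dates, a footnote reference and a blank cell.
column = ["01/09/2010", "2010-09-02", "see footnote 3", "  ", "3 September 2010"]
cleaned = [normalise_cell(cell) for cell in column]
```

Cells tagged 'text' in a column that is supposed to hold dates are the ones
somebody would then have to look at by hand.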

This is really problematic for us database nerds who just want to get the
data in and serve it back out.

Cleanup can happen afterwards. Adding an "I would like to report an error in
the data" link to the bottom of a web page is a pretty trivial development
task.

We (and the media) are being given the data in a format that suits the civil
servants, not us (or the public).

Just my opinion
Feargal Hogan



  Harry says: "standardised format, on a web page"
  Sam says: "but if it has changed, you need to check it"


  I'm trying to do something about EU farm subsidy data, and I can vouch that
government agencies give little thought to whoever receives the data. Worst
of all, the cleanup I do never goes back into their system, since their
"system" is a bunch of Excel files, so they can't run any typical database
operation.
(Otherwise I could just send them my SQL, and they'd get some technician to
help them -- maybe even give them an "auto-cleanup" script in an Access
database which they can use at the click of a button. They'd get a preview
and an OK/Cancel choice.)
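
  To illustrate the preview-then-confirm idea, here is a minimal sketch in
Python rather than SQL or Access. Everything in it is hypothetical: the
function names, the whitespace-stripping fix and the two sample rows are
invented for the example and have nothing to do with the real subsidy files.

```python
def propose_fixes(rows, fix):
    """Return (row_index, old_row, new_row) for every row the fix would change."""
    changes = []
    for i, row in enumerate(rows):
        new_row = fix(row)
        if new_row != row:
            changes.append((i, row, new_row))
    return changes

def apply_fixes(rows, changes, confirmed):
    """Return a cleaned copy if confirmed (OK), else an unchanged copy (Cancel)."""
    result = list(rows)
    if confirmed:
        for i, _old, new_row in changes:
            result[i] = new_row
    return result

def strip_whitespace(row):
    # Example fix: trim stray spaces around every value in the row.
    return {key: value.strip() for key, value in row.items()}

# Invented sample rows, not real data.
rows = [
    {"name": " Smith ", "amount": "1200"},
    {"name": "Jones", "amount": "950"},
]

changes = propose_fixes(rows, strip_whitespace)
for i, old, new in changes:
    print(f"row {i}: {old} -> {new}")  # the preview the user would see

cleaned = apply_fixes(rows, changes, confirmed=True)
```

  The point of splitting it in two steps is that nothing is modified until
somebody has seen the list of proposed changes and said OK.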


  Back on topic:
  So the small problem is getting the overall structure right, and the big
problem is data quality? Even if the standard dictated fields like
Firstname and Surname, the data would still come in with things mixed up? And
the current solution is that each media outlet spends days cleaning up the
data? This seems wasteful - surely it must be possible to get the data right
in the first place?


  Assuming that newspaper readers only look at the top and bottom of the
lists, the article author should probably phone those schools in advance and
allow them to comment on their placement. If the schools can't comment within
a day, maybe the paper can publish their comments some days later.


  What else does this "checking" that the media have to do involve, which
couldn't be handled more efficiently centrally?

  Thanks for reading
  /Simon
