Hi Simon,
I've also put together an XML API, which CountCulture has used to populate TWFY Local, but I'm especially interested in properly getting my head round RDFa, particularly with Tim Berners-Lee being so involved with Digital Britain et al. I imagine he's going to be really keen to push RDFa as a way of getting gov't data out there. I'm definitely going to be pushing our suppliers down this route too, and where there are difficulties, doing it myself!

I'm not Lichfield on Twitter, that's our Tourism team. I run @lichfield_dc and my personal account is @pezholio

stuart harrison
webmaster
lichfield district council
www.lichfielddc.gov.uk <http://www.lichfielddc.gov.uk/>
01543 308779
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Simon Gibbs
Sent: 06 July 2009 11:55
To: mySociety public, general purpose discussion list
Subject: Re: [mySociety:public] Putting Government Data online

That's great news! I thought it would be most efficient for the interested Councils to co-ordinate and strong-arm the dominant CMS vendors, but obviously any approach that works is good.

Returning to the present, if you are putting RDF online at all, then CountCulture should be able to scrape and SPARQL very easily and get back to tabular data. Happy to help with that also.

By the way, are you lichfield on twitter?

Simon

Harrison, Stuart wrote:

Thanks Simon, that's really useful.
I've already made an (admittedly cack-handed) effort to RDFa-ise our council members' pages, so this will be a big help.

Cheers

stuart harrison
webmaster
lichfield district council
www.lichfielddc.gov.uk <http://www.lichfielddc.gov.uk/>
01543 308779
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Simon Gibbs
Sent: 04 July 2009 11:59
To: [email protected]; mySociety public, general purpose discussion list
Subject: Re: [mySociety:public] Putting Government Data online

CountCulture wrote:

I'm the dev behind TheyWorkForYouLocal.com. I've used microformats before, so am fairly comfortable with that, but have been thinking about using RDFa for this project -- partly as a learning experience, and partly because (based on what little I know about RDFa) I'm thinking it might make more sense, as only a fraction of the data fits microformat types. I'm thinking that public data like this may already have some sort of RDFa schema (if that's the right expression). If you can point me in the right direction for RDFa stuff, that'd be great.

Hi again

I've put some demo pages online, try out the following URLs. I haven't tackled a minutes page, and the copies I took pre-date happiness stats, which is a bit of a shame.

http://cantorva.com/2009NS/twfyl-lod-demo/councils/45.htm
http://cantorva.com/2009NS/twfyl-lod-demo/members/1443.htm
http://cantorva.com/2009NS/twfyl-lod-demo/committees/771.htm
http://cantorva.com/2009NS/twfyl-lod-demo/meetings/4688.htm
http://cantorva.com/2009NS/twfyl-lod-demo/meetings-qm-council-id-eq-45.htm

I blogged some guidance about how to actually see the data:

http://cantorva.com/blog/2009/07/01/hints-on-browsing-embedded-rdfa-data-as-data/

The vocabulary is mostly FOAF, with Dublin Core, iCal and a bit of core RDF stuff.
There are a few places where I couldn't locate a term, so I invented some and documented them (also in RDFa) at:

http://cantorva.com/2009NS/twfyl-lod-demo/vocab

As well as keeping focused on the goal of creating an API for Linked Data hackers rather than fodder for SEO purposes (so no Google RDFa), I made a few assumptions and judgement calls as I went along:

Provenance is important to you - the clue was that you put last-modified dates in the XML as well as the page, and gave your pseudonym and homepage on all the pages. The new XML related to happiness etc. goes into more depth on provenance, so this is borne out. This led to...

The pages and the entities are different things - e.g. CountCulture did not make Brighton and Hove City Council, he made a page about it. This meant adding #disambiguator to the end of each URL when talking about councils, committees etc. and leaving it plain when talking about pages. Logic nerds will love you for this, since Document != Person and some tools barf when data implies otherwise. This turned into a bit of a pain, and it is optional pain, but frankly Document != Person appeals intuitively.

It's OK to talk about more than one entity per page - if it's useful for a user to see something then some application may also want it. The meeting example is a good one for talking about just about everything else. Even if there is better data on other pages, I marked up what was there. This should mean fewer HTTP GETs for some apps at the expense of a little bloat, and makes for additional machine-readable links between entities.

You don't want to change your markup - it's possible to put more data into some pages, notably the name of councils on meeting pages, but I left the markup mostly as it was. There are lots of extra spans and one extra div, then the actual RDFa attributes. There may have been some mangling done by Firefox as I saved out the pages, so if something has changed that makes no sense, ask and I can confirm why it changed.
You are not going to want two RDF vocabularies - so I made a few concessions to re-use by being deliberately vague in places, e.g. the vocab for "committee" does not link a council to a committee, it actually links Organization and Group (from FOAF). To be honest, I don't think many people will use (or notice) the machine-readable vocabulary anyway, so I tried not to overthink it or research every option or entity; notably, not all classes of local authority are documented.

Tabulator usability is also important - I put a bit of effort in to make sure Tabulator displayed the data as nicely as possible. This means the #disambiguator bits are actually structured, and means extra rdfs:label properties on entities. I figure if Tabulator uses that stuff other apps will too, plus you don't want to look at horrid presentations.

OK, that's it; that pretty much tells you what is there and why. Let me know your thoughts.

Simon

This e-mail and any attachment(s), is confidential and may be legally privileged. It is intended solely for the addressee. If you are not the addressee, dissemination, copying or use of this e-mail or any of its content is prohibited and may be unlawful. If you are not the intended recipient please inform the sender immediately and destroy the e-mail, any attachment(s) and any copies. All liability for viruses is excluded to the fullest extent permitted by law. It is your responsibility to scan or otherwise check this email and any attachment(s). Unless otherwise stated (i) views expressed in this message are those of the individual sender (ii) no contract may be construed by this e-mail. Emails may be monitored and you are taken to consent to this monitoring.
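
The page-versus-entity distinction described in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment name "council" and the helper names entity_uri and document_uri are hypothetical and are not taken from the demo pages, which may structure their #disambiguator fragments differently.

```python
from urllib.parse import urldefrag

# One of the demo page URLs from the message; it identifies the *page*.
PAGE_URL = "http://cantorva.com/2009NS/twfyl-lod-demo/councils/45.htm"


def entity_uri(page_url: str, fragment: str) -> str:
    """Build a URI for the thing a page describes by appending a #fragment.

    Any fragment already on page_url is dropped first, so the result always
    has exactly one fragment.
    """
    base, _old_fragment = urldefrag(page_url)
    return f"{base}#{fragment}"


def document_uri(uri: str) -> str:
    """Strip the fragment to recover the document an HTTP GET would fetch."""
    return urldefrag(uri).url


# "council" is a hypothetical fragment chosen for this sketch.
council = entity_uri(PAGE_URL, "council")

print(council)                # the council itself (an entity)
print(document_uri(council))  # the page about the council (a document)
```

Because the two identifiers differ only by the fragment, statements such as "made by CountCulture" can be attached to the page URI while statements such as "is an organisation" are attached to the entity URI, and a client still needs only one HTTP GET to retrieve both sets.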
_______________________________________________ Mailing list [email protected] Archive, settings, or unsubscribe: https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
