Thought the list might be interested in the latest developments on TheyWorkForYou :: Local, which has been coming along nicely over the past couple of weeks.
I have been working on adding both breadth and depth.

First, the breadth. I have more than doubled the number of councils scraped/parsed, to 40. The trick is knowing which CMS the council uses. I've improved the parser for Modern.gov (the most popular CMS used by councils to store the 'democracy' data) and added scrapers/parsers for CMIS (the second most popular one), as well as writing a few quick dedicated scrapers/parsers for councils with bespoke systems.

I've also added depth: the parsers now pick up links to declarations of interests, scrape meeting minutes where they are available as HTML (the next stage is to start converting some of the PDFs), and extract each council's main RSS news feed link (Hat-Tip: MashTheState.com).

I've also started incorporating external datasets. This is still at an early, experimental stage (i.e. I may change the implementation or API), but as with the scrapers/parsers, adding another one takes 10 minutes or so. The first couple of datasets are the recent Happiness Survey 2008 by the Dept of Comm & Local Govt (using the data on the Guardian's datastore) and the 2006/07 Local Govt Pension Funds Data (extracted by me from Form SF3; it obviously predates not just the crash but also the last knockings of the boom, but is AFAIK the most recent dataset available). As with some of the meeting info, not all councils have all the data: I have to programmatically work out which body each entry refers to, as there's no consistency in naming or use of IDs. Let me know if there are other datasets you think I should include.

Finally, I've added basic timed scraping, so the scrapers are run regularly to pick up changes.

As ever, comments and suggestions welcome. It's still at an early stage, but coming along reasonably well (I think).
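To illustrate the name-matching problem mentioned above (working out which council a dataset row refers to when there are no shared IDs), here is a minimal Python sketch. It is not the actual TheyWorkForYou :: Local code: the NOISE_WORDS list and the function names normalise, build_index and match_council are all hypothetical, and a real matcher would need more special cases.

```python
import re

# Hypothetical list of words that differ between datasets but do not
# identify a council ("Camden LB" vs "London Borough of Camden").
NOISE_WORDS = {
    "council", "borough", "city", "county", "district", "metropolitan",
    "london", "royal", "of", "the", "and", "lb", "mbc", "dc", "cc", "bc",
}


def normalise(name):
    """Reduce a council name to a lowercase key with the noise words removed."""
    tokens = re.findall(r"[a-z]+", name.lower().replace("&", " and "))
    kept = [t for t in tokens if t not in NOISE_WORDS]
    # If every word was noise (e.g. "City of London"), keep all the tokens
    # rather than returning an empty key.
    return " ".join(kept or tokens)


def build_index(council_names):
    """Map each normalised key to the name used on the site."""
    return {normalise(name): name for name in council_names}


def match_council(dataset_name, index):
    """Return the matching council name, or None if nothing matches."""
    return index.get(normalise(dataset_name))


councils = [
    "London Borough of Camden",
    "Birmingham City Council",
    "Kensington and Chelsea",
]
index = build_index(councils)
camden = match_council("Camden LB", index)
kensington = match_council("Royal Borough of Kensington & Chelsea", index)
unknown = match_council("Narnia District Council", index)
```

Rows that come back as None would then be left out for that dataset, or checked by hand.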
Before I get stuck into it, I'm particularly keen to hear what feeds people would find useful (I'm starting with feeds rather than email alerts, to keep things simple). Council meetings as an iCal (.ics) feed? The latest meeting minutes for a council?

Cheers
C
