Thought the list might be interested in the latest developments on 
TheyWorkForYou :: Local, which has come along nicely over the past 
couple of weeks.

I have been working on adding both breadth and depth.

First the breadth. I have now more than doubled the number of councils 
scraped/parsed, to 40. The trick is knowing what CMS the council uses -- 
I've improved the parser for Modern.gov (the most popular CMS used by 
councils to store the 'democracy' data) a bit, added scrapers/parsers 
for CMIS (the second most popular one), and written a few quick 
dedicated scrapers/parsers for councils with bespoke systems.
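
To illustrate the "know the CMS, pick the parser" idea, here is a rough 
Python sketch of a parser registry. It is not the actual 
TheyWorkForYou :: Local code: the names (PARSERS, register, 
parse_council) and the placeholder parser bodies are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a CMS name to its parser function.
PARSERS: Dict[str, Callable[[str], dict]] = {}


def register(cms: str):
    """Decorator that files a parser function under a CMS name."""
    def wrap(func: Callable[[str], dict]) -> Callable[[str], dict]:
        PARSERS[cms] = func
        return func
    return wrap


@register("moderngov")
def parse_moderngov(html: str) -> dict:
    # Placeholder body: a real parser would extract committees, members, etc.
    return {"cms": "moderngov", "bytes_seen": len(html)}


@register("cmis")
def parse_cmis(html: str) -> dict:
    # Placeholder body, as above.
    return {"cms": "cmis", "bytes_seen": len(html)}


def parse_council(cms: str, html: str) -> dict:
    """Dispatch to the parser registered for this council's CMS."""
    try:
        parser = PARSERS[cms]
    except KeyError:
        raise ValueError(f"no parser registered for CMS {cms!r}")
    return parser(html)
```

With a layout like this, supporting a council on an already-known CMS 
only needs its CMS name recorded; a bespoke system needs one more 
registered function.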

I've also added depth: the parsers now pick up links to declarations 
of interests, scrape meeting minutes where they are available as HTML 
(next stage is to start converting some of the PDFs), and extract each 
council's main RSS news feed link (Hat-Tip: MashTheState.com).
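
One common way to find a site's main feed is RSS autodiscovery, i.e. 
looking for a <link rel="alternate" type="application/rss+xml"> tag in 
the home page. The sketch below assumes that convention; I don't know 
that it is how the feed links are actually found here, and 
FeedLinkFinder and find_rss_link are made-up names.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Remembers the href of the first RSS autodiscovery <link> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.feed_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.feed_href is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime == "application/rss+xml" and attr.get("href"):
            self.feed_href = attr["href"]


def find_rss_link(html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the page's RSS feed, or None if not advertised."""
    finder = FeedLinkFinder()
    finder.feed(html)
    if finder.feed_href is None:
        return None
    # Relative hrefs are resolved against the page URL.
    return urljoin(base_url, finder.feed_href)
```

Councils that don't advertise a feed this way would simply come back 
as None and need a dedicated lookup.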

I've also started incorporating external datasets -- it's still at an 
early, experimental stage (i.e. I may change the implementation or API), 
but, as with the scrapers/parsers, adding a new one takes 10 minutes 
or so.

The first couple of datasets are the recent Happiness Survey 2008 by the 
Dept of Comm & Local Govt (using the data on the Guardian's datastore) 
and the 2006/07 Local Govt Pension Funds Data (extracted by me from Form 
SF3 -- obviously predates not just the crash but also the last knockings 
of the boom, but is AFAIK the most recent dataset available).

As with some of the meeting info, not every council has every dataset 
-- I have to work out programmatically which body each record refers 
to, as there's no consistency in naming or use of IDs. Let me know if 
there are other datasets you think I should include.
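
As a rough idea of what matching bodies by name can involve, here is a 
Python sketch that normalises names before comparing them. The 
noise-word list and the function names are illustrative assumptions, 
not the matching rules actually used, and real data will need more 
special cases.

```python
import re
from typing import Dict, Iterable, Optional

# Words that often differ between datasets for the same body.
# Illustrative only; a real list would be tuned against the data.
NOISE_WORDS = {
    "council", "borough", "city", "county", "district",
    "metropolitan", "royal", "london", "of", "the", "mbc",
}


def normalise(name: str) -> str:
    """Reduce a council name to a lowercase key with noise words removed."""
    tokens = re.findall(r"[a-z]+", name.lower().replace("&", " and "))
    kept = [t for t in tokens if t not in NOISE_WORDS]
    # If every word was noise (e.g. "City of London"), keep them all.
    return " ".join(kept or tokens)


def build_index(council_names: Iterable[str]) -> Dict[str, str]:
    """Map each normalised key to the name used in our own records."""
    return {normalise(n): n for n in council_names}


def match_body(dataset_name: str, index: Dict[str, str]) -> Optional[str]:
    """Return our council name for a dataset row, or None if unmatched."""
    return index.get(normalise(dataset_name))
```

Rows that come back as None are the ones to check by hand, which is 
also one reason some councils end up without data for a given dataset.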

Finally, I've added basic timed scraping, so the scrapers are run 
regularly to pick up on changes.

As ever, comments and suggestions welcome. It's still at an early stage, 
but coming along reasonably well (I think).

I'm particularly keen (before I get stuck into it) to hear what feeds 
people would find useful (I'm starting with feeds rather than email 
alerts to keep it simple) -- council meetings as an iCal (.ics) feed? 
The latest meeting minutes for a council?
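
To make the iCal option concrete, here is a minimal Python sketch of 
what a meetings feed could contain, written by hand against RFC 5545. 
The Meeting fields, the PRODID string and the function names are 
assumptions for illustration; nothing like this exists on the site yet.

```python
from datetime import datetime, timezone
from typing import Iterable, NamedTuple


class Meeting(NamedTuple):
    # Hypothetical record shape, not the site's data model.
    uid: str          # globally unique id for the event
    committee: str    # used as the event summary
    start: datetime   # must be timezone-aware
    url: str          # page with the agenda or minutes


def _escape(text: str) -> str:
    """Escape backslash, semicolon, comma and newline for an iCal TEXT value."""
    return (text.replace("\\", "\\\\").replace(";", "\\;")
                .replace(",", "\\,").replace("\n", "\\n"))


def _utc(dt: datetime) -> str:
    """Format an aware datetime as an iCal UTC date-time."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def meetings_to_ical(meetings: Iterable[Meeting], generated_at: datetime) -> str:
    """Build a VCALENDAR with one VEVENT per meeting.

    Folding of lines longer than 75 octets is left out to keep this short.
    """
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//council-meetings//EN",  # placeholder product id
    ]
    for m in meetings:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{m.uid}",
            f"DTSTAMP:{_utc(generated_at)}",
            f"DTSTART:{_utc(m.start)}",
            f"SUMMARY:{_escape(m.committee)}",
            f"URL:{m.url}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCal content lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

A feed like this could be generated per council or per committee, so 
people can subscribe to just the meetings they care about.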


Cheers
C


_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
