Alex

This looks useful (but complex). Any chance you could point me in the 
direction of which of the 15 spreadsheets (in the Excel file - don't 
have Access) contains the Local Authorities, and which is the primary key?

Cheers
C

Alex Skene wrote:
> We should probably use SNAC codes as the identifier for local authorities
> <http://www.ons.gov.uk/about-statistics/geography/products/geog-products-area/snac/index.html>
>
> Cheers
> Alex
>
> 2009/6/19 Francis Irving <[email protected]>:
>   
>> (copied to WhatDoTheyKnow team)
>>
>> Anyone here know about identifiers for local authorities?
>>
>> I'm inclined to use Wikipedia article ids, as that will extend to
>> other authorities as well.
>>
>> Francis
>>
>> On Thu, Jun 18, 2009 at 11:44:12AM +0100, CountCulture wrote:
>>     
>>> Francis
>>> Thought it might be useful if twfylocal could show status of WDTK
>>> requests (total, recent, no answered, outstanding late etc), with basic
>>> details of requests (though prob makes sense to go to WDTK site for full
>>> details of request).
>>>
>>> Re id system, it's something I've been struggling with as everywhere
>>> uses a different system, so at the moment each twfylocal council record
>>> stores the following ids/refs:
>>>
>>> :id (integer, twfy_local internal primary id. WON'T CHANGE)
>>> :name (string, as scraped from eGR, though with some minor edits)
>>> :wikipedia_url (string, as scraped from eGR, though have already found
>>> one mistake)
>>> :ons_url (string)
>>> :egr_id (integer, this is most useful as it gives links to loads of
>>> other things -- e.g. various gov pages -- doesn't change AFAIK even if
>>> the authority name does)
>>> :wdtk_name (string, from scraping WDTK and trying to match against
>>> shortened version of name -- successful about 80% of the time)
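
For illustration only, the per-council record listed above could be sketched as a small Python dataclass. The field names follow the list; the class name, the types, the helper method and the sample values are assumptions, not taken from the twfy_local code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CouncilRecord:
    """Hypothetical mirror of the ids/refs listed above (not the real schema)."""
    id: int                              # internal primary id, stable
    name: str                            # as scraped from eGR
    wikipedia_url: Optional[str] = None
    ons_url: Optional[str] = None
    egr_id: Optional[int] = None         # stable even if the name changes
    wdtk_name: Optional[str] = None      # None when no WDTK match was found

    def external_ids(self) -> dict:
        """Return only the external references that are actually populated."""
        candidates = {
            "wikipedia_url": self.wikipedia_url,
            "ons_url": self.ons_url,
            "egr_id": self.egr_id,
            "wdtk_name": self.wdtk_name,
        }
        return {key: value for key, value in candidates.items() if value is not None}


# Made-up sample values, for illustration only.
example = CouncilRecord(id=1, name="Exampleton Borough Council", egr_id=123)
```

Keeping every external reference optional reflects the point that no single scheme covers every authority yet.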
>>>
>>> Had a look at the WDTK code and I seem to remember the internal primary
>>> id is exposed in at least one place, but that it didn't help as you
>>> couldn't do queries by it. What we could really do with is a canonical
>>> id for each authority.
>>>
>>> FWIW you can use the eGR on twfylocal, though it adds an extra step (if
>>> you go to theyworkforyoulocal.com/councils.xml it returns all the
>>> councils together with their ids and the eGR ids). If you could match
>>> WDTK with eGR ids (for example) and make the match available
>>> programmatically, we would have the beginnings of a makeshift common id.
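
A rough sketch of the matching step discussed above: shorten council names on both sides and join on the shortened form to get a lookup from WDTK name to eGR id. The list of words to drop, the function names and the sample data in the usage note are all assumptions for illustration; the actual matching in twfy_local may work differently.

```python
import re

# Generic words dropped when shortening names; an assumed list, not the real rules.
_NOISE = {"council", "borough", "city", "county", "district",
          "metropolitan", "of", "the"}


def short_name(name: str) -> str:
    """Lower-case a council name and strip generic words and punctuation."""
    words = re.findall(r"[a-z]+", name.lower())
    return " ".join(w for w in words if w not in _NOISE)


def match_wdtk_to_egr(egr_councils: dict, wdtk_names: list):
    """Map WDTK names to eGR ids via shortened names.

    egr_councils: {egr_id: council name}; wdtk_names: names scraped from WDTK.
    Returns (matches, unmatched) so the failures can be reviewed by hand.
    """
    by_short = {short_name(name): egr_id for egr_id, name in egr_councils.items()}
    matches, unmatched = {}, []
    for wdtk_name in wdtk_names:
        egr_id = by_short.get(short_name(wdtk_name))
        if egr_id is None:
            unmatched.append(wdtk_name)
        else:
            matches[wdtk_name] = egr_id
    return matches, unmatched
```

Called with made-up data such as `match_wdtk_to_egr({101: "Exampleton Borough Council"}, ["Exampleton Council", "Nowhere Transport"])`, the first name matches on the shortened form and the second lands in the unmatched list for manual checking.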
>>>
>>> Thoughts?
>>>
>>>
>>> Francis Irving wrote:
>>>       
>>>> There are RSS feeds of latest responses, including quite fancy ones if
>>>> you use advanced search keywords. They only give extracts from the new
>>>> messages though. What exact information are you trying to get?
>>>>
>>>> There is no structured way to get status or similar out of the site.
>>>>
>>>> Finally, we could agree an id system for name matching. I'd quite like,
>>>> in a way, to mark every authority with, say, its identifier in
>>>> Wikipedia, to aid merging with other databases.
>>>>
>>>> What identifiers are you using in your system?
>>>>
>>>> Francis
>>>>
>>>> On Wed, Jun 17, 2009 at 03:05:26PM +0200, Tom Steinberg wrote:
>>>>
>>>>         
>>>>> Hi,
>>>>>
>>>>> I'm afraid I don't know, but I've CCed the team who look after WDTK
>>>>> to ask.
>>>>>
>>>>> Tom
>>>>>
>>>>> 2009/6/17 CountCulture <[email protected]>:
>>>>>
>>>>>           
>>>>>> Tom
>>>>>> Follow up question. At the moment I've got a link to the What Do They
>>>>>> Know page for the council. Any probs with including more info from WDTK
>>>>>> such as status and latest responses, and is there a good way to get that
>>>>>> other than scraping the data (had a look at the code and there didn't
>>>>>> really seem to be)?
>>>>>> Cheers
>>>>>> C
>>>>>>
>>>>>> -------- Original Message --------
>>>>>>
>>>>>> Tom
>>>>>>
>>>>>> Digging deeper is actually where I'd intended to go first, but when I
>>>>>> started to explore some of the council websites I found that even shallow
>>>>>> data was problematic and reckoned I needed an API and structure that at
>>>>>> the very least could cope with those variants (and reuse the
>>>>>> scrapers/parsers once written) -- hence the proof-of-concept nature.
>>>>>>
>>>>>> However, now I've got the basics worked out (though there's still
>>>>>> tweaking and issues to be done there), delving deeper's the next step.
>>>>>> In particular, working out the best way of finding/storing/parsing
>>>>>> council docs (which are often unstructured PDFs, sometimes even just
>>>>>> PDFs which are just scans), and also working out an elegant way of
>>>>>> linking with other data sources.
>>>>>>
>>>>>> Thanks for the kind words, I'll keep the list updated with major
>>>>>> developments, or you can always watch the github repository.
>>>>>>
>>>>>> Cheers
>>>>>> C
>>>>>>
>>>>>> Tom Steinberg wrote:
>>>>>>
>>>>>>             
>>>>>>> Hi there,
>>>>>>>
>>>>>>> Cool - great to see people hacking on councils, it's been something
>>>>>>> I've wanted to see for ages.
>>>>>>>
>>>>>>> I see you've gone straight for getting the councillors of several
>>>>>>> different councils, but I'd actually suggest going deeper rather than
>>>>>>> wider. Why not just dive deep into one council and see if you can get
>>>>>>> transcripts or other documents nicely scraped and parsed? I'd love to
>>>>>>> see at least a handful of councils in TheyWorkForYou proper by the end
>>>>>>> of the year.
>>>>>>>
>>>>>>> Well done anyway!
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> 2009/6/16 CountCulture <[email protected]>:
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>>>> Quick note about something I've been working on in my spare time:
>>>>>>>>
>>>>>>>> http://theyworkforyoulocal.com -- a small app to scrape and parse local
>>>>>>>> authority info.
>>>>>>>>
>>>>>>>> At the moment, it's barely more than a proof of concept, with only
>>>>>>>> about 20 or so councils parsed, and even then only current councillors,
>>>>>>>> committees, committee membership and forthcoming meetings are parsed.
>>>>>>>>
>>>>>>>> On the upside, it's fairly quick for me to add new parsers for councils
>>>>>>>> (and reuse ones already written if they use same CMS), there's an API
>>>>>>>> built in (basically just add .json or .xml to get the info as json or
>>>>>>>> XML), and there's lots of potential.
>>>>>>>>
>>>>>>>> Getting this far has also been an education in understanding what a
>>>>>>>> full-blown twfy_local might look like (in general there seems no way to
>>>>>>>> see how councillors voted, for example), the need for such a resource
>>>>>>>> (there's no publicly available central repository for council election
>>>>>>>> results, for example), and the sorry state of local authority websites
>>>>>>>> (just finding a list of councillors is a challenge on some, and don't
>>>>>>>> get me started on the HTML markup).
>>>>>>>>
>>>>>>>> Comments welcome. Code is at
>>>>>>>> http://github.com/CountCulture/twfy_local_parser/ (I'll probably GPL it
>>>>>>>> soon). Bug reports at
>>>>>>>> http://github.com/CountCulture/twfy_local_parser/issues and offers of
>>>>>>>> help to countculture at googlemail dot com.
>>>>>>>>
>>>>>>>> I'd especially be interested in hearing from anyone who's got any
>>>>>>>> knowledge about local authority CMSs (e.g. there seem to be several
>>>>>>>> different versions of Modern.Gov producing different URLs), or sources
>>>>>>>> for more data other than the local authority websites (e.g. eGR,
>>>>>>>> info4local).
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> C
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list [email protected]
>>>>>>>> Archive, settings, or unsubscribe:
>>>>>>>>
>>>>>>>> https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
>>>>>>>>

