Dear all,

following up on my earlier email, I just pushed a list of pincodes for all electoral booths across India to GitHub and made a pull request to the datameet repository:

https://github.com/datameet/pincodes/pull/2

Please note that this may be incomplete, and is based on a rather brutish, quick and dirty hack - see comments in rolls2pincode.pl. But it does use the same IDs as those in the 2014 elections, and hence can be combined with my GIS shapefiles for polling booths:

http://dx.doi.org/10.4119/unibi/2674065

I leave it to others to double-check accuracy and create actual pincode maps.

I hope this is useful,

Best,
Raphael

On 28.03.2016 07:50, Raphael Susewind wrote:
> Dear Avinash and all,
>
> I will try to make some time this week to scrape the pincodes from
> electoral rolls for all polling booths in my electoral GIS shapefiles.
>
> Since the pincode is in Latin script, this should not be affected by the
> much discussed PDF scraping issues with electoral rolls.
>
> We could then either go down the voronoi route, or alternatively use the
> heatmap processing chain that I used to generate AC boundaries. The
> latter would have the advantage of dealing with wrong coordinates in the
> booth point dataset (basically, not all electoral booth coordinates are
> correct; consequently, if we only voronoi, we would quite frequently
> have a blip of pincode B within a sea of pincode A. The heatmap stuff
> takes care of this).
>
> Since I am not familiar with postal boundaries: can anyone here confirm
> whether pincode areas are contiguous, and whether each pincode has only
> one area? Or can it be that several non-contiguous areas have the same
> pincode, interspersed with other pincodes? (In which case voronoi would
> perhaps be the better solution after all.)
>
> In any case, I hope to give you the pincode for each polling booth by
> end of the week or so (based on all-India 2014 electoral rolls),
>
> Best,
> Raphael
>
> On 28.03.2016 06:33, Avinash Celestine wrote:
>
>> perhaps one way is to avoid using postal data altogether.
>>
>> All header pages in electoral rolls (the first page) contain the name of
>> the polling station related to that roll, the PS number, and importantly
>> the pin code.
>>
>> A site like psleci.nic.in <http://psleci.nic.in> has geographic
>> coordinates of polling stations (though Raphael had collected the data
>> earlier*). Matching the two will give a fairly dense scattering of
>> points - in fact much denser than if we used some of the methods
>> earlier in this thread.
>>
>> We thus have a way of associating a pin code with a geo coordinate. We
>> can then use the voronoi method.
>>
>> Electoral rolls are mostly in PDF, which makes them difficult to scrape.
>> But from what I have seen, for any given state, the location of the
>> pincode number on the header page is more or less constant, making it
>> possible to target just that part of the page with any PDF parser.
>>
>> Electoral rolls have become difficult to download in bulk (a good
>> thing!) but I understand different people on this group have the PDFs
>> for different states. Putting this stuff together should give us
>> comprehensive data on header pages for at least some states.
>> Alternatively, we can file RTIs for just the header pages of electoral
>> rolls, though I don't know how successful that would be.
>>
>> * Raphael's data is
>> at https://github.com/raphael-susewind/india-election-data
>>
>>
>> On Sun, Mar 27, 2016 at 12:07 PM, srinivas kodali <[email protected]> wrote:
>>
>>     Well, there were postal delivery zones in the past and the postal
>>     department even used to make maps of these zones. The Delhi postal
>>     delivery zone map
>>     <https://drive.google.com/file/d/0B1RcWLku0ZOWWVBHMldrZWdfZEU/view?usp=sharing>
>>     had boundaries for Delhi. I am not sure if other cities had them or
>>     how long the postal department was doing this, but it certainly can
>>     help with the boundaries for cities.
>>
>>     Regards,
>>     Srinivas Kodali
>>     www.lostprogrammer.com <http://www.lostprogrammer.com>
>>     /"Not everyone who wanders is lost, I am probably a bit"/
>>
>>     On Tue, Mar 22, 2016 at 9:29 PM, Arun Ganesh <[email protected]> wrote:
>>
>>         Shravan, crowdsourcing the boundaries of pincodes is not as
>>         trivial as you think. To start with, an area does not fall under
>>         a pincode, rather a street does based on the post office that
>>         services it. Read
>>         this: http://www.georeference.org/doc/zip_codes_are_not_areas.htm
>>
>>         You may also want to do some background reading of existing
>>         research that has been done by the group
>>         here: https://datameet.hackpad.com/M4hPFJVV2Gm?eid=v4YoXN4tTw5
>>
>>         To sum up, nobody has precise pincode boundaries like how you
>>         imagine them, not even the postal department. Any existing
>>         datasets are an estimate at best using some data processing on a
>>         large volume of address data.
>>
>>         --
>>         Datameet is a community of Data Science enthusiasts in India.
>>         Know more about us by visiting http://datameet.org
>>         ---
>>         You received this message because you are subscribed to the
>>         Google Groups "datameet" group.
>>         To unsubscribe from this group and stop receiving emails from
>>         it, send an email to [email protected].
>>         For more options, visit https://groups.google.com/d/optout.
>

--
Dr Raphael Susewind | Associate, Contemporary South Asia Studies, Oxford
Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web & Twitter | https://www.raphael-susewind.de | @RaphaelSusewind
Impact | https://impactstory.org/raphael-susewind
Please consider https://www.gnupg.org for encryption (key id 10AEE42F)
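
For anyone who wants to experiment with the voronoi route discussed in this thread, here is a minimal sketch in Python. Giving a location the pincode of its nearest booth is the same as looking up which voronoi cell it falls into. The second function is a simple k-nearest majority vote, used here only as a stand-in to show how a single misplaced booth can be outvoted; it is not the heatmap processing chain mentioned above. The booth records, coordinates and function names are made up for illustration and are not taken from the pull request or the shapefiles.

```python
import math
from collections import Counter

# Hypothetical sample data: (booth_id, latitude, longitude, pincode).
# Real input would come from joining the pincode list with booth coordinates.
BOOTHS = [
    ("B1", 28.600, 77.200, "110001"),
    ("B2", 28.610, 77.210, "110001"),
    ("B3", 28.620, 77.190, "110001"),
    ("B4", 28.700, 77.100, "110034"),
    ("B5", 28.710, 77.110, "110034"),
    ("B6", 28.605, 77.205, "110034"),  # deliberately misplaced booth
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_booths(lat, lon, booths, k):
    """Return the k booths closest to the given point."""
    return sorted(booths, key=lambda b: haversine_km(lat, lon, b[1], b[2]))[:k]


def voronoi_pincode(lat, lon, booths):
    """Pincode of the single nearest booth (voronoi cell membership)."""
    return nearest_booths(lat, lon, booths, 1)[0][3]


def majority_pincode(lat, lon, booths, k=3):
    """Most common pincode among the k nearest booths."""
    votes = Counter(b[3] for b in nearest_booths(lat, lon, booths, k))
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Query right at the misplaced booth: the two methods disagree.
    print(voronoi_pincode(28.605, 77.205, BOOTHS))
    print(majority_pincode(28.605, 77.205, BOOTHS))
```

With the sample data, the nearest-booth lookup inherits the wrong pincode of the misplaced booth, while the majority vote follows its neighbours. Whether a vote of this kind is appropriate depends on the open question of whether pincode areas are contiguous at all.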
