Dear all,

following up on my earlier email, I just pushed a list of pincodes for
all electoral booths across India to GitHub and made a pull request to
the datameet repository:

https://github.com/datameet/pincodes/pull/2

Please note that the list may be incomplete, and that it is based on a rather
brutish, quick and dirty hack - see the comments in rolls2pincode.pl. But it
does use the same IDs as those in the 2014 elections, and hence can be
combined with my GIS shapefiles for polling booths:

http://dx.doi.org/10.4119/unibi/2674065
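
Because the pincode list and the shapefiles share the same booth IDs,
combining them should be a plain key join. A minimal sketch in Python,
assuming both have been exported to CSV; the column names (booth_id,
pincode, lat, lon), the ID format and the sample rows are hypothetical and
will differ from the actual files:

```python
import csv
import io


def join_on_booth_id(pincode_rows, booth_rows, key="booth_id"):
    """Attach a pincode to every booth record that shares the same ID.

    Booths without a matching pincode are kept with pincode=None,
    so gaps in the scraped list stay visible.
    """
    pincode_by_id = {row[key]: row["pincode"] for row in pincode_rows}
    joined = []
    for booth in booth_rows:
        merged = dict(booth)
        merged["pincode"] = pincode_by_id.get(booth[key])
        joined.append(merged)
    return joined


# Hypothetical sample data standing in for the real exports.
PINCODES_CSV = """booth_id,pincode
S24-001-001,226001
S24-001-002,226003
"""
BOOTHS_CSV = """booth_id,lat,lon
S24-001-001,26.85,80.95
S24-001-002,26.86,80.93
S24-001-003,26.87,80.91
"""

pincodes = list(csv.DictReader(io.StringIO(PINCODES_CSV)))
booths = list(csv.DictReader(io.StringIO(BOOTHS_CSV)))

for row in join_on_booth_id(pincodes, booths):
    print(row["booth_id"], row["pincode"], row["lat"], row["lon"])
```

Keeping unmatched booths with an empty pincode, rather than dropping them,
makes it easy to count how incomplete the list is.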

I leave it to others to double-check accuracy and create actual pincode
maps. I hope this is useful,

Best,
Raphael

On 28.03.2016 07:50, Raphael Susewind wrote:

> Dear Avinash and all,
> 
> I will try to make some time this week to scrape the pincodes from
> electoral rolls for all polling booths in my electoral GIS shapefiles.
> 
> Since the pincode is in Latin script, this should not be affected by the
> much discussed PDF scraping issues with electoral rolls.
> 
> We could then either go down the Voronoi route, or alternatively use the
> heatmap processing chain that I used to generate AC boundaries. The
> latter would have the advantage of dealing with wrong coordinates in the
> booth point dataset: not all electoral booth coordinates are correct, so
> if we only use Voronoi, we would quite frequently get a blip of pincode B
> within a sea of pincode A. The heatmap approach takes care of this.
> 
> Since I am not familiar with postal boundaries: can anyone here confirm
> whether pincode areas are contiguous, and whether each pincode has only
> one area? Or can several non-contiguous areas share the same pincode,
> interspersed with other pincodes? (In which case Voronoi would perhaps
> be the better solution after all.)
> 
> In any case, I hope to give you the pincode for each polling booth by
> end of the week or so (based on all-India 2014 electoral rolls),
> 
> Best,
> Raphael
> 
> On 28.03.2016 06:33, Avinash Celestine wrote:
> 
>> Perhaps one way is to avoid using postal data altogether.
>>
>> The header page (the first page) of every electoral roll contains the
>> name of the polling station related to that roll, the PS number, and,
>> importantly, the pin code.
>>
>> A site like psleci.nic.in <http://psleci.nic.in> has geographic
>> coordinates of polling stations (though Raphael had collected the data
>> earlier*). Matching the two will give a fairly dense scattering of
>> points - in fact much denser than if we used some of the methods
>> discussed earlier in this thread.
>>
>> We thus have a way of associating a pin code with a geographic
>> coordinate. We can then use the Voronoi method.
>>
>> Electoral rolls are mostly in PDF, which makes them difficult to scrape.
>> But from what I have seen, for any given state, the location of the
>> pincode on the header page is more or less constant, making it
>> possible to target just that part of the page with any PDF parser.
>>
>> Electoral rolls have become difficult to download in bulk (a good
>> thing!) but I understand different people on this group have the PDFs
>> for different states. Putting this stuff together should give us
>> comprehensive data on header pages for at least some states.
>> Alternatively, we can file RTIs for just the header pages of electoral
>> rolls, though I don't know how successful that would be.
>>
>> * Raphael's data is
>> at https://github.com/raphael-susewind/india-election-data
>>
>>
>>
>> On Sun, Mar 27, 2016 at 12:07 PM, srinivas kodali <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     Well, there were postal delivery zones in the past and the postal
>>     department even used to make maps of these zones. The Delhi postal
>>     delivery zone map
>>     <https://drive.google.com/file/d/0B1RcWLku0ZOWWVBHMldrZWdfZEU/view?usp=sharing>
>>     had boundaries for Delhi. I am not sure if other cities had them or
>>     how long the postal department was doing this, but it certainly can
>>     help with the boundaries for cities.
>>
>>     Regards,
>>     Srinivas Kodali
>>     www.lostprogrammer.com <http://www.lostprogrammer.com>
>>     /"Not everyone who wanders is lost, I am probably a bit"/
>>
>>     On Tue, Mar 22, 2016 at 9:29 PM, Arun Ganesh <[email protected]
>>     <mailto:[email protected]>> wrote:
>>
>>         Shravan, crowdsourcing the boundaries of pincodes is not as
>>         trivial as you think. To start with, an area does not fall under
>>         a pincode; rather, a street does, based on the post office that
>>         services it. Read
>>         this: http://www.georeference.org/doc/zip_codes_are_not_areas.htm
>>
>>         You may also want to do some background reading of existing
>>         research that has been done by the group
>>         here: https://datameet.hackpad.com/M4hPFJVV2Gm?eid=v4YoXN4tTw5
>>
>>         To sum up, nobody has precise pincode boundaries like how you
>>         imagine them, not even the postal department. Any existing
>>         datasets are an estimate at best using some data processing on a
>>         large volume of address data.
>>
>>         -- 
>>         Datameet is a community of Data Science enthusiasts in India.
>>         Know more about us by visiting http://datameet.org
>>         ---
>>         You received this message because you are subscribed to the
>>         Google Groups "datameet" group.
>>         To unsubscribe from this group and stop receiving emails from
>>         it, send an email to [email protected]
>>         <mailto:[email protected]>.
>>         For more options, visit https://groups.google.com/d/optout.
>>
> 
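
To illustrate the trade-off discussed in the quoted thread: Voronoi
assignment amounts to giving every location the pincode of its single
nearest booth, so one wrongly located booth creates a blip. The sketch
below contrasts that with a majority vote among the k nearest booths. The
vote is only a simple stand-in for illustration, not the heatmap
processing chain mentioned above; the booth data is made up, and plain
planar distance is assumed (real lat/lon data would need projecting
first).

```python
from collections import Counter
from math import hypot


def nearest_pincode(point, booths):
    """Voronoi-style lookup: pincode of the single closest booth."""
    x, y = point
    closest = min(booths, key=lambda b: hypot(b[0] - x, b[1] - y))
    return closest[2]


def majority_pincode(point, booths, k=3):
    """Most common pincode among the k closest booths."""
    x, y = point
    ranked = sorted(booths, key=lambda b: hypot(b[0] - x, b[1] - y))
    votes = Counter(b[2] for b in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical booths as (x, y, pincode) in planar coordinates.
# The last booth carries pincode "B" but is wrongly located amid "A" booths.
booths = [
    (0.0, 0.0, "A"), (1.0, 0.0, "A"), (0.0, 1.0, "A"), (1.0, 1.0, "A"),
    (10.0, 10.0, "B"), (11.0, 10.0, "B"), (10.0, 11.0, "B"),
    (0.5, 0.5, "B"),
]

query = (0.6, 0.6)
print(nearest_pincode(query, booths))   # picks up the misplaced booth: B
print(majority_pincode(query, booths))  # outvoted by its neighbours: A
```

Note that the vote would also smooth away genuine small enclaves, which
matters if pincode areas turn out not to be contiguous.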

-- 
Dr Raphael Susewind | Associate, Contemporary South Asia Studies, Oxford
         Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
      Web & Twitter | https://www.raphael-susewind.de | @RaphaelSusewind
             Impact | https://impactstory.org/raphael-susewind

Please consider https://www.gnupg.org for encryption (key id 10AEE42F)
        
