Re: [datameet] Pincode Boundaries of India

2016-04-02 Thread Raphael Susewind
Hi Dev,

there are state/state.boothraw.* shapefiles, these should contain the
raw polling booth locations.

The heatmap scripts are terribly customized - I would have to look into this
myself, I am afraid, which could take some time (very busy).

You would have to go with voronois for now, sorry,
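
A minimal sketch of what the voronoi route amounts to, assuming the booth
points have already been joined with their pincodes. The sample records and
the function name nearest_pincode are made up for illustration and are not
taken from the shapefiles:

```python
import math

# Made-up sample records: (longitude, latitude, pincode).
booths = [
    (77.20, 28.60, "110001"),
    (77.25, 28.65, "110006"),
    (77.10, 28.55, "110037"),
]


def nearest_pincode(lon, lat, booths):
    """Return the pincode of the booth closest to (lon, lat).

    All locations that map to the same booth form that booth's voronoi
    cell, so this labels a location without building polygons explicitly.
    """
    # Plain Euclidean distance on degrees: a simplification that ignores
    # map projection and is only reasonable over short distances.
    best = min(booths, key=lambda b: math.hypot(b[0] - lon, b[1] - lat))
    return best[2]


print(nearest_pincode(77.21, 28.61, booths))
```

For actual polygons, scipy.spatial.Voronoi can compute the cells from the
same points; cells that share a pincode would then need to be merged.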

Best,
Raphael

On 02.04.2016 09:02, Devdatta Tengshe wrote:
> Hi Raphael,
> 
> Firstly, thanks a lot for extracting this information.
> 
> I was looking at http://dx.doi.org/10.4119/unibi/2674065, but I could
> find only the Boundaries for the constituencies.
> 
> Can you tell us where we can find the locations of the polling booths
> that you had extracted?
> 
> Secondly, can you also share (if you still have them) the heatmaps code
> that you used to create the constituency boundaries? I think that is
> what will be required to create the pincode boundaries as well.
> 
> Regards,
> Dev
> 
> Regards,
> Devdatta
> 
> On Fri, Apr 1, 2016 at 6:31 PM, Raphael Susewind
> <li...@raphael-susewind.de> wrote:
> 
> Dear all,
> 
> following up on my earlier email, I just pushed a list of pincodes for
> all electoral booths across India to GitHub and made a pull request to
> the datameet repository:
> 
> https://github.com/datameet/pincodes/pull/2
> 
> Please note that this can be incomplete, and is based on a rather
> brutish, quick and dirty hack - see comments in rolls2pincode.pl. But it
> does use the same IDs as those in the 2014 elections, and hence can be
> combined with my GIS shapefiles for polling booths:
> 
> http://dx.doi.org/10.4119/unibi/2674065
> 
> I leave it to others to double-check accuracy and create actual pincode
> maps. I hope this is useful,
> 
> Best,
> Raphael
> 
> On 28.03.2016 07:50, Raphael Susewind wrote:
> 
> > Dear Avinash and all,
> >
> > I will try to make some time this week to scrape the pincodes from
> > electoral rolls for all polling booths in my electoral GIS shapefiles.
> >
> > Since pincode is in latin script, this should not be affected by the
> > much discussed PDF scraping issues with electoral rolls.
> >
> > We could then either go down the voronoi route, or alternatively use the
> > heatmap processing chain that I used to generate AC boundaries - this
> > latter would have the advantage of dealing with wrong coordinates in the
> > booth point dataset (basically, not all electoral booth coordinates are
> > correct; consequently, if we only voronoi, we would have a blip of
> > pincode B within a sea of pincode A quite frequently. The heatmap stuff
> > takes care of this).
> >
> > Since I am not familiar with postal boundaries: can anyone here confirm
> > whether pincode areas are contiguous, and whether each pincode has only
> > one area? Or can it be that several non-contiguous areas have the same
> > pincode, interspersed with other pincodes? (In which case voronoi would
> > perhaps be the better solution after all)
> >
> > In any case, I hope to give you the pincode for each polling booth by
> > end of the week or so (based on all-India 2014 electoral rolls),
> >
> > Best,
> > Raphael
> >
> > On 28.03.2016 06:33, Avinash Celestine wrote:
> >
> >> perhaps one way is to avoid using postal data altogether.
> >>
> >> All header pages in electoral rolls (the first page) contain the name of
> >> the polling station related to that roll, the PS number, and importantly
> >> the pin code.
> >>
> >> A site like psleci.nic.in has geog coordinates
> >> of polling stations (though Raphael had collected the data earlier*).
> >> Matching the two will give a fairly dense scattering of points - in
> >> fact much more dense than if we used some of the methods earlier in this
> >> thread.
> >>
> >> We thus have a way of associating a pin code with a geo coordinate. We
> >> can then use the voronoi method.
> >>
> >> Electoral rolls are mostly in pdf, which makes them difficult to scrape.
> >> But from what i have seen, for any given state, the location on the
> >> header page, of the pincode number is more or less constant, making it
> >> possible to target just that part of the page with any pdf parser.
> >>
> >> Electoral rolls have become difficult to download in bulk (a good
> >> thing!) but i understand different people on this group have the pdfs
> >> for different states. Putting this stuff together should give us
> >> comprehensive data on header pages for at least some states.
> >> Alternatively, we can file RTIs for just the header pages of electoral
> >> rolls, though i don't know how successful that would be.
> >>
> >> * Raphael's 

Re: [datameet] Pincode Boundaries of India

2016-04-02 Thread Devdatta Tengshe
Hi Raphael,

Firstly, thanks a lot for extracting this information.

I was looking at http://dx.doi.org/10.4119/unibi/2674065, but I could find
only the Boundaries for the constituencies.

Can you tell us where we can find the locations of the polling booths that
you had extracted?

Secondly, can you also share (if you still have them) the heatmaps code
that you used to create the constituency boundaries? I think that is what
will be required to create the pincode boundaries as well.

Regards,
Dev

Regards,
Devdatta

On Fri, Apr 1, 2016 at 6:31 PM, Raphael Susewind wrote:

> Dear all,
>
> following up on my earlier email, I just pushed a list of pincodes for
> all electoral booths across India to GitHub and made a pull request to
> the datameet repository:
>
> https://github.com/datameet/pincodes/pull/2
>
> Please note that this can be incomplete, and is based on a rather
> brutish, quick and dirty hack - see comments in rolls2pincode.pl. But it
> does use the same IDs as those in the 2014 elections, and hence can be
> combined with my GIS shapefiles for polling booths:
>
> http://dx.doi.org/10.4119/unibi/2674065
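
Because the IDs are shared, combining the two datasets is a plain key join.
A rough sketch follows; the column names (booth_id, pincode, longitude,
latitude), the ID format and the sample rows are all hypothetical and will
need adapting to the real files:

```python
import csv
import io

# Hypothetical columns and rows; the real files may be laid out differently.
PINCODES_CSV = """booth_id,pincode
AC001-PS001,110001
AC001-PS002,110006
"""

BOOTHS_CSV = """booth_id,longitude,latitude
AC001-PS001,77.20,28.60
AC001-PS002,77.25,28.65
AC001-PS003,77.10,28.55
"""


def join_on_booth_id(pincodes_csv, booths_csv):
    """Attach a pincode to every booth point that has one.

    Returns (joined, missing): joined is a list of (lon, lat, pincode)
    tuples, missing lists the booth IDs with no pincode, which is worth
    tracking since the pincode list can be incomplete.
    """
    pins = {
        row["booth_id"]: row["pincode"]
        for row in csv.DictReader(io.StringIO(pincodes_csv))
    }
    joined, missing = [], []
    for row in csv.DictReader(io.StringIO(booths_csv)):
        pin = pins.get(row["booth_id"])
        if pin is None:
            missing.append(row["booth_id"])
        else:
            joined.append((float(row["longitude"]), float(row["latitude"]), pin))
    return joined, missing


joined, missing = join_on_booth_id(PINCODES_CSV, BOOTHS_CSV)
print(len(joined), "booths joined;", len(missing), "without a pincode")
```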
>
> I leave it to others to double-check accuracy and create actual pincode
> maps. I hope this is useful,
>
> Best,
> Raphael
>
> On 28.03.2016 07:50, Raphael Susewind wrote:
>
> > Dear Avinash and all,
> >
> > I will try to make some time this week to scrape the pincodes from
> > electoral rolls for all polling booths in my electoral GIS shapefiles.
> >
> > Since pincode is in latin script, this should not be affected by the
> > much discussed PDF scraping issues with electoral rolls.
> >
> > We could then either go down the voronoi route, or alternatively use the
> > heatmap processing chain that I used to generate AC boundaries - this
> > latter would have the advantage of dealing with wrong coordinates in the
> > booth point dataset (basically, not all electoral booth coordinates are
> > correct; consequently, if we only voronoi, we would have a blip of
> > pincode B within a sea of pincode A quite frequently. The heatmap stuff
> > takes care of this).
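
To make the "blip of B within a sea of A" problem concrete: the sketch below
is not the heatmap processing chain, only a much simpler stand-in that flags
a booth when most of its k nearest neighbours carry a different pincode. The
function name flag_outliers, the value of k and the sample points are
assumptions for illustration:

```python
import math
from collections import Counter


def flag_outliers(booths, k=3):
    """Return indices of booths whose pincode disagrees with the majority
    pincode among their k nearest neighbours.

    booths is a list of (x, y, pincode) tuples.
    """
    flagged = []
    for i, (x, y, pin) in enumerate(booths):
        others = [b for j, b in enumerate(booths) if j != i]
        # Sort the remaining booths by straight-line distance to this one.
        others.sort(key=lambda b: math.hypot(b[0] - x, b[1] - y))
        votes = Counter(b[2] for b in others[:k])
        majority, count = votes.most_common(1)[0]
        # Flag only when a clear majority of neighbours disagrees.
        if majority != pin and count > k // 2:
            flagged.append(i)
    return flagged


# Made-up points: four booths of pincode "A" around a single "B".
sample = [
    (0.0, 0.0, "A"), (1.0, 0.0, "A"), (0.0, 1.0, "A"), (1.0, 1.0, "A"),
    (0.5, 0.5, "B"),
]
print(flag_outliers(sample))
```

If pincode areas turn out to be genuinely non-contiguous, a filter like this
would also flag legitimate enclaves, so flagged booths are candidates for
review rather than certain errors.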
> >
> > Since I am not familiar with postal boundaries: can anyone here confirm
> > whether pincode areas are contiguous, and whether each pincode has only
> > one area? Or can it be that several non-contiguous areas have the same
> > pincode, interspersed with other pincodes? (In which case voronoi would
> > perhaps be the better solution after all)
> >
> > In any case, I hope to give you the pincode for each polling booth by
> > end of the week or so (based on all-India 2014 electoral rolls),
> >
> > Best,
> > Raphael
> >
> > On 28.03.2016 06:33, Avinash Celestine wrote:
> >
> >> perhaps one way is to avoid using postal data altogether.
> >>
> >> All header pages in electoral rolls (the first page) contain the name of
> >> the polling station related to that roll, the PS number, and importantly
> >> the pin code.
> >>
> >> A site like psleci.nic.in has geog coordinates
> >> of polling stations (though Raphael had collected the data earlier*).
> >> Matching the two will give a fairly dense scattering of points  - in
> >> fact much more dense than if we used some of the methods earlier in this
> >> thread.
> >>
> >> We thus have a way of associating a pin code with a geo coordinate. We
> >> can then use the voronoi method.
> >>
> >> Electoral rolls are mostly in pdf, which makes them difficult to scrape.
> >> But from what i have seen, for any given state, the location on the
> >> header page, of the pincode number is more or less constant, making it
> >> possible to target just that part of the page with any pdf parser.
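
Once the text of that part of the header page has been extracted (with
whichever pdf parser), picking out the pincode can be as simple as the sketch
below. It relies on Indian pincodes being six digits with a non-zero first
digit; the function name extract_pincode and the sample header line are made
up:

```python
import re

# Six digits, first digit 1-9, not embedded in a longer run of digits.
PINCODE_RE = re.compile(r"(?<!\d)[1-9]\d{5}(?!\d)")


def extract_pincode(header_text):
    """Return the first pincode-like number in header_text, or None."""
    match = PINCODE_RE.search(header_text)
    return match.group(0) if match else None


# Made-up header line, not copied from an actual roll.
sample_header = "Polling Station No. 42, Govt. Primary School, Pin Code: 560034"
print(extract_pincode(sample_header))
```

Any other six-digit number in the text would match as well, which is one more
reason to crop to the region where the pincode sits before running this.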
> >>
> >> Electoral rolls have become difficult to download in bulk (a good
> >> thing!) but i understand different people on this group have the pdfs
> >> for different states. Putting this stuff together should give us
> >> comprehensive data on header pages for at least some states.
> >> Alternatively, we can file RTIs for just the header pages of electoral
> >> rolls, though i don't know how successful that would be.
> >>
> >> * Raphael's data is
> >> at https://github.com/raphael-susewind/india-election-data
> >>
> >>
> >>
> >> On Sun, Mar 27, 2016 at 12:07 PM, srinivas kodali
> >> <iota.kod...@gmail.com> wrote:
> >>
> >> Well, there were postal delivery zones in the past and the postal
> >> department even used to make maps of these zones. The Delhi postal
> >> delivery zone map
> >> <https://drive.google.com/file/d/0B1RcWLku0ZOWWVBHMldrZWdfZEU/view?usp=sharing>
> >> had boundaries for Delhi. I am not sure if other cities had them or how
> >> long the postal department was doing this, but it certainly can help
> >> with the boundaries for cities.
> >>
> >> Regards,
> >> Srinivas Kodali
> >> www.lostprogrammer.com 
> >> /"Not everyone who wanders is lost, I am probably a bit"/
> >>
> >> On