[datameet] [old school?] random sampling vs big data - which is better?

2019-11-05 Thread Avinash Celestine
A very interesting paper, which asks the question:

What should I trust more: a 1% survey with 60% response rate or a
*self-reported* administrative dataset covering 80% of the population?

https://statistics.fas.harvard.edu/files/statistics-2/files/statistical_paradises_and_paradoxes.pdf
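
For anyone who wants to play with the trade-off, here is a rough Python sketch. It assumes the identity in the linked paper is: error of the sample mean = rho * sqrt((1 - f) / f) * sigma, where f is the fraction of the population that ends up in the data, sigma is the population standard deviation, and rho is the correlation between being recorded and the value being measured. The rho values below are made up purely for illustration; they are not estimates for any real survey or administrative dataset.

```python
import math


def relative_error(rho, f):
    """Error of the sample mean in units of the population standard deviation.

    rho: correlation between being recorded and the value measured.
    f:   fraction of the population that ends up in the data.
    """
    return rho * math.sqrt((1.0 - f) / f)


def effective_sample_size(rho, f):
    """Rough size of a simple random sample with the same squared error."""
    return f / (1.0 - f) / rho ** 2


# Made-up correlations: small for the survey (non-response close to random),
# larger for the self-reported administrative data.
survey_f = 0.01 * 0.60   # 1% sample with a 60% response rate
admin_f = 0.80           # administrative data covering 80% of the population

survey_n_eff = effective_sample_size(rho=0.01, f=survey_f)
admin_n_eff = effective_sample_size(rho=0.05, f=admin_f)
```

Changing the two rho values moves the answer far more than changing the coverage does, which is really the point of the question: the comparison hinges on how strongly inclusion in the data is correlated with the thing being measured.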

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
--- 
You received this message because you are subscribed to the Google Groups 
"datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/datameet/CALPXTXp33A6%3DrbC%3DBkQAS7YsugPU4g-dSkgCva6skwe4kcso2A%40mail.gmail.com.


Re: [datameet] Parliamentary constituency boundaries 2019

2019-03-18 Thread Avinash Celestine
Constituency boundaries were last delimited in 2008, and have not changed
since.

On Sat, Mar 16, 2019 at 12:06 AM Arun Ganesh  wrote:

> With the upcoming elections, this would be a hot dataset that everyone
> will be looking for. The best available dataset on the web right now is on
> the datameet repository, updated during the previous elections in 2014.
>
> Does anyone know if there have been changes in the constituency boundaries
> since 2014? Also the existing boundaries are fairly generalized resulting
> in an accuracy of around a km.
>
> See this comparison for Bengaluru: 1) PC shapes from datameet 2) AC shapes
> from datameet 3) PC shapes from Karnataka KSRAC
> [image: new1.gif]
>
> The KSRAC boundaries were queried from their geoserver and are super
> accurate up to the street level, but are limited to Karnataka only.
> Does anyone know how we can source this for the entire country?


Re: [datameet] Re: India GIS Data Update/ Questions

2017-08-28 Thread Avinash Celestine
Enclosing a mapping of 2001-2011 village / town codes. These files were
retrieved quite some time back from
http://egovstandards.gov.in/code-directories-of-generic-data-elements-for-land-region-codification
though that link no longer works (link to the files at bottom).

The big 'missing link' here is the urban ward-to-ward mapping across 2001-2011.

the zip file is at
https://github.com/avinashcelestine/village_town_mapping_2001_2011
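
A minimal sketch of how such a mapping could be used to translate 2001 codes to 2011 codes. The column names (code_2001, code_2011, name) and the sample rows are hypothetical; the actual files in the zip may be laid out differently, so adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: one row per (2001 code, 2011 code) pair.
SAMPLE = """code_2001,code_2011,name
00010100,000001,Village A
00010200,000002,Village B
00010200,000003,Village B (part)
"""


def load_mapping(handle):
    """Map each 2001 code to the list of 2011 codes it corresponds to."""
    mapping = defaultdict(list)
    for row in csv.DictReader(handle):
        mapping[row["code_2001"].strip()].append(row["code_2011"].strip())
    return dict(mapping)


mapping = load_mapping(io.StringIO(SAMPLE))

# Units that map to more than one 2011 code need manual attention.
split_units = {old: new for old, new in mapping.items() if len(new) > 1}
```

Codes are kept as strings so that leading zeros survive, and each 2001 code maps to a list because a village may have been split between the two censuses.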


> On Fri, Aug 25, 2017 at 3:16 PM, J M <justinelliotmey...@gmail.com> wrote:
>
>> Yes, by chance did you see I uploaded the entire India Village data set?
>> That is by far the most detailed data set I have seen out of India besides
>> this data set. Let me know what you think about that area or if you have
>> any specific questions or concerns. Thanks.
>>
>> On Aug 24, 2017 11:27 PM, "Nikhil VJ" <nikhil...@gmail.com> wrote:
>>
>>> Yep, this is good, and the license text from MRSAC being there is
>>> significant. There is a column to differentiate between village, gap
>>> etc which may also be very helpful. It's only having census 2001 code
>>> though, so will have to work on matching with 2011 codes.
>>>
>>> Thanks Justin, will look into this in coming days (other folks in MH
>>> and into mapping, join in!).
>>> The totals will be important to compare across this and other datasets
>>> like census data.
>>>
>>> Would you have similar data for Madhya Pradesh too? (checked your
>>> github, didn't find it there)
>>>
>>>
>>> On 8/25/17, Justin <justinelliotmey...@gmail.com> wrote:
>>> > Maharashtra:
>>> > https://github.com/justinelliotmeyers/official_maharshtra_india_village_boundary_shapefile
>>> >
>>> > This is the highest quality data I have seen for here
>>> >
>>>
>>>
>>> --
>>> --
>>> Cheers,
>>> Nikhil VJ
>>> +91-966-583-1250
>>> Pune / Mandangad, India
>>> DataMeet Pune chapter <https://datameet-pune.github.io/>
>>> Self-designed learner at Swaraj University <
>>> http://www.swarajuniversity.org>
>>> Blog <http://nikhilsheth.blogspot.in>


Re: [datameet] Pincode Boundaries of India

2017-03-01 Thread Avinash Celestine
Hi Veena

The gram panchayat names are given in the village census district
handbooks. However, they are also available at lgdirectory.gov.in >>
Download Directory >> gram panchayat mapping to village.


Avinash

On Wed, Mar 1, 2017 at 2:46 PM, Veena Ramanna <ve...@indiagoverns.org>
wrote:

> Dear Avinash,
>
> I was looking at the pincodes_censuscodes.zip
> <https://github.com/avinashcelestine/Pincodes-data/blob/master/pincodes_censuscodes.zip>
>  in the github at the location you have specified. This is helpful.
>
> In addition to the pin codes, you have also been able to get gram
> panchayat names. How have you managed that? Could you please tell me.
>
> regards,
> Veena
>


Re: [datameet] Re: What do you people think about the new proposed Bill about Maps and Geospatial data

2016-05-09 Thread Avinash Celestine
Also, I guess it's worth pointing out that if the primary aim is to
prevent inaccurate depiction of India's borders and impose penalties on
those who do so, there is already existing legislation which does that.

https://indiankanoon.org/doc/496978/

Plus, the IT Act too penalises wrong depiction of boundaries.

It is worth looking at this piece from back in 2007, which talked about the
possibility of such legislation:

http://www.livemint.com/Politics/IRguUu8B1fWHgS8ej3rI5M/Showing-incorrect-map-could-become-a-crime.html



On Mon, May 9, 2016 at 12:21 PM, shirish शिरीष 
wrote:

> Hi all,
>
> As a layman user of maps, did my 2-bit.
>
>
> https://flossexperiences.wordpress.com/2016/05/09/using-gps-without-license-jail-for-7-years/
>
> People are welcome to critique it (I know it goes all over the place
> and have added govt. conspiracy which some people might not be
> comfortable with) .
>
> Do have suggestion that if people do make blog posts that the least we
> can do is do ping and linkback to each other, if for nothing but to
> show solidarity in the cause.
>
> My 2 paise :)
>
> --
>   Regards,
>   Shirish Agarwal  शिरीष अग्रवाल
>   My quotes in this email licensed under CC 3.0
> http://creativecommons.org/licenses/by-nc/3.0/
> http://flossexperiences.wordpress.com
> EB80 462B 08E1 A0DE A73A  2C2F 9F3D C7A4 E1C4 D2D8


[datameet] update to pincodes data

2016-04-30 Thread Avinash Celestine
Hi

Came across this site a week or so ago : http://pmjdy.gov.in/g-i-s.aspx.

This site, which is part of the PMJDY (jan dhan yojana), has geolocations
for post offices among others. There are 1.42 lakh post office locations
given in this map. I have tried to match these locations to actual pincodes
from the list of post offices given in data.gov.in.

I managed to match about 1.35 lakh. About 7k-8k are unmatched.

So we have 1.35 lakh post office locations with associated geocodes and
pincodes. The remaining 7k-8k can be matched with a little more effort; I am
still working on it.

The data is here : https://github.com/avinashcelestine/Pincodes-data
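
For anyone attempting the leftover matches, here is one possible approach, sketched in Python. This is not necessarily how the matching above was done: it normalises office names, tries an exact match, and falls back to fuzzy matching with the standard library's difflib. The office names, pincodes, suffix list and cutoff below are all made up for illustration.

```python
import difflib
import re

# Office-type suffixes to ignore when comparing names (an assumed list).
OFFICE_SUFFIXES = {"so", "bo", "ho", "po"}


def normalise(name):
    """Lower-case, drop punctuation and office-type suffixes such as S.O / B.O."""
    name = name.lower().replace(".", "")
    name = re.sub(r"[^a-z0-9]+", " ", name)
    words = [w for w in name.split() if w not in OFFICE_SUFFIXES]
    return " ".join(words)


def match_offices(located_names, directory, cutoff=0.85):
    """Match geolocated office names against a {name: pincode} directory."""
    index = {normalise(n): pin for n, pin in directory.items()}
    matched, unmatched = {}, []
    for name in located_names:
        key = normalise(name)
        if key in index:
            matched[name] = index[key]
            continue
        close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
        if close:
            matched[name] = index[close[0]]
        else:
            unmatched.append(name)
    return matched, unmatched


# Made-up sample data.
directory = {"Rampur B.O": "999001", "Gandhi Nagar S.O": "999002"}
matched, unmatched = match_offices(["RAMPUR", "Gandhi Nagr SO", "Nowhere"], directory)
```

Office names repeat across districts, so in practice the lookup should probably be restricted to offices within the same state or district before any fuzzy matching is attempted.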

regards
Avinash



Re: [datameet] Pincode Boundaries of India

2016-04-04 Thread Avinash Celestine
Adding some relevant data.

The district handbooks released by the census provide pincode details for
village areas. For each village / location code, they give the pincode
which covers that village. I have collated that data from the handbook
files and put it up here:

https://github.com/avinashcelestine/pincodes_censuscodes

Unfortunately, the district handbooks only provide such pincode data for
villages, not towns or wards. And even among villages, a number of
locations don't have the relevant pincode info (I think a large part of
Madhya Pradesh is blank, for instance). Another bunch of villages have
pincodes with fewer than six digits. Despite these issues, there are
still 4.13 lakh villages, out of a total of 6.4 lakh, for which six-digit
pincodes are given.
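
A small sketch of how the usable rows could be separated from the blank or short ones. The row layout and sample values are hypothetical, and the pattern assumes a valid pincode is exactly six digits with a non-zero first digit.

```python
import re

# Assumption: a valid pincode is six digits and does not start with 0.
SIX_DIGIT = re.compile(r"[1-9][0-9]{5}")


def clean_pincode(raw):
    """Return the pincode as a string if it is six digits, else None."""
    value = (raw or "").strip()
    return value if SIX_DIGIT.fullmatch(value) else None


# Made-up rows: (village code, village name, pincode as given in the handbook).
rows = [
    ("000101", "Village A", "999001"),
    ("000102", "Village B", "9990"),   # fewer than six digits
    ("000103", "Village C", ""),       # blank in the handbook
]

usable = {code: clean_pincode(pin) for code, _, pin in rows if clean_pincode(pin)}
```

Rows that fail the check are simply left out of the lookup here; the source data keeps them, so nothing is lost.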

The handbooks are here:
http://www.censusindia.gov.in/2011census/dchb/DCHB.html

(For the sake of completeness, I have not removed entries for those
villages for which pincodes are not given.)

These will be useful for anyone looking to map census data to pincodes.

regards

Avinash

On Sun, Apr 3, 2016 at 6:47 PM, Dilip Damle <cadvis...@gmail.com> wrote:

> Hi Devdatta,
>
> What is the data that you are looking for.
> I had saved it.
> But it is huge. About 1.55 GB consisting of about 135 shapefiles out of
> which 70 are raw booths.
>
>
>
> On Saturday, April 2, 2016 at 12:32:47 PM UTC+5:30, Devdatta Tengshe wrote:
>>
>> Hi Raphael,
>>
>> Firstly, thanks a lot for extracting this information.
>>
>> I was looking at http://dx.doi.org/10.4119/unibi/2674065, but I could
>> find only the Boundaries for the constituencies.
>>
>> Can you tell us where we can find the locations of the polling booths
>> that you had extracted?
>>
>> Secondly, can you also share (if you still have them) the heatmaps code
>> that you used to create the constituency boundaries? I think that is what
>> will be required to create the pincode boundaries as well.
>>
>> Regards,
>> Dev
>>
>> Regards,
>> Devdatta

Re: [datameet] Pincode Boundaries of India

2016-04-01 Thread Avinash Celestine
Thanks v much Raphael. This is great.

On Friday 1 April 2016, Raphael Susewind <li...@raphael-susewind.de> wrote:

> Dear all,
>
> following up on my earlier email, I just pushed a list of pincodes for
> all electoral booths across India to GitHub and made a pull request to
> the datameet repository:
>
> https://github.com/datameet/pincodes/pull/2
>
> Please note that this can be incomplete, and is based on a rather
> brutish, quick and dirty hack - see comments in rolls2pincode.pl. But it
> does use the same IDs as those in the 2014 elections, and hence can be
> combined with my GIS shapefiles for polling booths:
>
> http://dx.doi.org/10.4119/unibi/2674065
>
> I leave it to others to double-check accuracy and create actual pincode
> maps. I hope this is useful,
>
> Best,
> Raphael
>
> On 28.03.2016 07:50, Raphael Susewind wrote:
>
> > Dear Avinash and all,
> >
> > I will try to make some time this week to scrape the pincodes from
> > electoral rolls for all polling booths in my electoral GIS shapefiles.
> >
> > Since pincode is in latin script, this should not be affected by the
> > much discussed PDF scraping issues with electoral rolls.
> >
> > We could then either go down the voronoi route, or alternatively use the
> > heatmap processing chain that I used to generate AC boundaries - this
> > latter would have the advantage of dealing with wrong coordinates in the
> > booth point dataset (basically, not all electoral booth coordinates are
> > correct; consequently, if we only voronoi, we would have a blip of
> > pincode B within a sea of pincode A quite frequently. The heatmap stuff
> > takes care of this).
> >
> > Since I am not familiar with postal boundaries: can anyone here confirm
> > whether pincode areas are contiguous, and whether each pincode has only
> > one area? Or can it be that several non-contiguous areas have the same
> > pincode, interspersed with other pincodes? (In which case voronoi would
> > perhaps be the better solution after all)
> >
> > In any case, I hope to give you the pincode for each polling booth by
> > end of the week or so (based on all-India 2014 electoral rolls),
> >
> > Best,
> > Raphael
> >> On Sun, Mar 27, 2016 at 12:07 PM, srinivas kodali
> >> <iota.kod...@gmail.com> wrote:
> >>
> >> Well, There were postal delivery zones in the past and the postal
> >> department even used to make maps of these zones. The Delhi postal
> >> delivery zone map
> >> <https://drive.google.com/file/d/0B1RcWLku0ZOWWVBHMldrZWdfZEU/view?usp=sharing>
> >> had boundaries for delhi. I am not sure if other cities had them or how
> >> long the postal department was doing this, but it certainly can hel

Re: [datameet] Pincode Boundaries of India

2016-03-27 Thread Avinash Celestine
Perhaps one way is to avoid using postal data altogether.

All header pages in electoral rolls (the first page) contain the name of the
polling station related to that roll, the PS number, and importantly the
pin code.

A site like psleci.nic.in has geographic coordinates of polling stations
(though Raphael had collected the data earlier*). Matching the two will give
a fairly dense scattering of points - in fact much denser than if we
used some of the methods earlier in this thread.

We thus have a way of associating a pin code with a geo coordinate. We can
then use the voronoi method.
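
To make the idea concrete: asking which Voronoi cell a location falls in is the same as asking which booth is nearest to it. The sketch below does that lookup by brute force in plain Python, using great-circle distance. The coordinates and pincodes are made up, and a real run over lakhs of booths would want a spatial index or a GIS tool rather than a linear scan.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_pincode(lat, lon, booths):
    """Pincode of the closest booth, i.e. the Voronoi cell containing (lat, lon).

    booths: iterable of (lat, lon, pincode) tuples.
    """
    closest = min(booths, key=lambda b: haversine_km(lat, lon, b[0], b[1]))
    return closest[2]


# Made-up booth locations and pincodes.
booths = [
    (18.50, 73.85, "999001"),
    (18.60, 73.95, "999002"),
]
pin_here = nearest_pincode(18.51, 73.86, booths)
```

Note that a single booth with a wrong coordinate will claim its own patch of the map under this scheme, so some cleaning of the points is needed first.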

Electoral rolls are mostly in PDF, which makes them difficult to scrape. But
from what I have seen, for any given state, the location of the pincode
on the header page is more or less constant, making it possible to
target just that part of the page with any PDF parser.

Electoral rolls have become difficult to download in bulk (a good thing!)
but I understand different people on this group have the PDFs for different
states. Putting this stuff together should give us comprehensive data on
header pages for at least some states. Alternatively, we can file RTIs for
just the header pages of electoral rolls, though I don't know how successful
that would be.

* Raphael's data is at
https://github.com/raphael-susewind/india-election-data



On Sun, Mar 27, 2016 at 12:07 PM, srinivas kodali 
wrote:

> Well, There were postal delivery zones in the past and the postal
> department even used to make maps of these zones. The Delhi postal
> delivery zone map
> <https://drive.google.com/file/d/0B1RcWLku0ZOWWVBHMldrZWdfZEU/view?usp=sharing>
> had boundaries for delhi. I am not sure if other cities had them or how long
> the postal department was doing this, but it certainly can help with the
> boundaries for cities.
>
> Regards,
> Srinivas Kodali
> www.lostprogrammer.com
> *"Not everyone who wanders is lost, I am probably a bit"*
>
> On Tue, Mar 22, 2016 at 9:29 PM, Arun Ganesh  wrote:
>
>> Shravan, crowdsourcing the boundaries of pincodes is not as trivial as
>> you think. To start with, an area does not fall under a pincode, rather a
>> street does based on the post office that services it. Read this:
>> http://www.georeference.org/doc/zip_codes_are_not_areas.htm
>>
>> You may also want to do some background reading of existing research that
>> has been done by the group here:
>> https://datameet.hackpad.com/M4hPFJVV2Gm?eid=v4YoXN4tTw5
>>
>> To sum up, nobody has precise pincode boundaries like how you imagine
>> them, not even the postal department. Any existing datasets are an estimate
>> at best using some data processing on a large volume of address data.


Re: [datameet] Form 20 for Mumbai

2014-11-07 Thread Avinash Celestine
No, unfortunately not. My impression is that Maharashtra is slower at
putting these out than some other states.

A
On Nov 7, 2014 2:09 PM, Raphael Susewind li...@raphael-susewind.de
wrote:

 Hi Avinash,

 Thanks - I did not check all PDFs systematically, should have done...

 Any idea when and/or whether assembly form 20 will be available?

 Best,
 Raphael

 On 07.11.2014 08:52, Avinash Celestine wrote:
  I should mention that these are for the parliamentary elections of May
  2014, not the recent assembly elections.
 
  A
 

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web  Twitter | http://www.raphael-susewind.de | @RaphaelSusewind

 Please do consider http://www.gnupg.org for encryption (key id 10AEE42F)



Re: [datameet] Form 20 for Mumbai

2014-11-06 Thread Avinash Celestine
Hi Raphael,

Some of those links are dead, but not all. It seems not all Form 20s for
each constituency have been uploaded yet. I have attached the ones that are
available (downloaded some time back). As far as Mumbai is concerned, I
think the South Mumbai data is not there yet. Ignore the PDF files which
are too small in size (2KB etc.); those are the ones for which the links
were dead.
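
If it helps, a quick way to skip the dead-link files after unzipping is to filter on file size. This is only a sketch: the folder name in the usage line and the 4 KB cut-off are assumptions, chosen so that the roughly 2KB stubs fall below it.

```python
from pathlib import Path

# Assumed cut-off: the dead-link stubs are around 2 KB, so anything under
# 4 KB is treated as a failed download.
MIN_BYTES = 4 * 1024


def usable_pdfs(folder, min_bytes=MIN_BYTES):
    """Return the PDFs in `folder` that are at least `min_bytes` in size."""
    return sorted(
        path for path in Path(folder).glob("*.pdf")
        if path.stat().st_size >= min_bytes
    )
```

For example, usable_pdfs("mhGE2014-incomplete") would list only the files worth opening, assuming the zip was extracted into a folder of that name.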




​
mhGE2014-incomplete.zip
https://docs.google.com/file/d/0BxAgA1sHG2dMcDVrWFd1Vkozb2M/edit?usp=drive_web

On Fri, Nov 7, 2014 at 12:09 PM, Raphael Susewind li...@raphael-susewind.de
 wrote:

 Dear all,

 does anyone have access to booth-level results for Maharashtra,
 especially Mumbai, both general and assembly elections? Or any
 information as to whether and when it might be available? On the CEO
 website, one finds links to general election form 20, but those links
 are dead. No mention of assembly data (either now or earlier):

 https://www.ceo.maharashtra.gov.in/Results/Form20.aspx

 Any hint appreciated,
 Raphael

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web & Twitter | http://www.raphael-susewind.de | @RaphaelSusewind

 Please do consider http://www.gnupg.org for encryption (key id 10AEE42F)



[datameet] fyi ...EC GIS maps for assembly constituencies

2014-10-15 Thread Avinash Celestine
Apologies if this has already come up in the past on this group, but the
EC's polling station locator site (psleci.nic.in) now has KML files for
ACs for a number of states. The links (zip files) are in the source of the
main page.

However, not all states are there: Tamil Nadu, J&K, Kerala, Nagaland etc.
are missing. Files for the majority of states are available.

I have not explored all the files, only Maharashtra. It opens fine in
something like Fusion Tables, but in software like QGIS all the
attribute data gets stripped out.




regards
Avinash
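
One way to check whether the attributes are still present in the KML itself (and are only being dropped on import) is to read the file directly. The following is a minimal sketch, not a tested recipe for the EC files: it assumes the attributes are stored as ExtendedData/Data elements under each Placemark and that the file uses the KML 2.2 namespace. The sample KML string and the field names AC_NO and ST_NAME are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the KML 2.2 namespace; older files may differ.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Made-up sample; the real EC files may use a different layout.
SAMPLE_KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Sample AC</name>
      <ExtendedData>
        <Data name="AC_NO"><value>1</value></Data>
        <Data name="ST_NAME"><value>Maharashtra</value></Data>
      </ExtendedData>
    </Placemark>
  </Document>
</kml>"""


def read_placemark_attributes(kml_text):
    """Return one dict per Placemark: its name plus any ExtendedData/Data values."""
    root = ET.fromstring(kml_text.encode("utf-8"))
    rows = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        row = {"name": placemark.findtext("kml:name", default="", namespaces=KML_NS)}
        for data in placemark.iterfind("kml:ExtendedData/kml:Data", KML_NS):
            row[data.get("name")] = data.findtext(
                "kml:value", default="", namespaces=KML_NS
            )
        rows.append(row)
    return rows


rows = read_placemark_attributes(SAMPLE_KML)
print(rows)
```

If the rows come back with only a name, the attributes are probably stored somewhere else in the file (for example inside the description element), and the sketch would need adapting.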



Re: [datameet] Security in Indian Railways Websites

2014-08-14 Thread Avinash Celestine
lol. and an earlier email to this group had pointed out that another part
of the railway site had the captcha code hardcoded into the html source...






On Fri, Aug 15, 2014 at 5:18 AM, srinivas kodali iota.kod...@gmail.com
wrote:

 Somebody needs to help the IT team at Centre for Railway Information
 Systems (CRIS) learn what captcha is

 check the implementation of captcha in FNR Enquiry

 http://www.fois.indianrail.gov.in/

 Regards,
 Srinivas



[datameet] temperature data for cities - historical

2014-06-19 Thread Avinash Celestine
Does anyone know of a source of daily temperature data (max,min) for cities
which goes back a few years?

Most of the daily data only goes back a couple of months, and those which
are a couple of years old are monthly averages.

regards

Avinash



Re: [datameet] Re: temperature data for cities - historical

2014-06-19 Thread Avinash Celestine
thanks. Yes, I was looking for daily data.

rgds

A


On Thu, Jun 19, 2014 at 3:20 PM, Pratap Vardhan pratap...@gmail.com wrote:

 Not quite what you'd want. Here
 http://www.imd.gov.in/section/nhac/mean/110_new.htm you would
 find Monthly Mean Maximum & Minimum temperature and monthly total rainfall
 of important stations for the period 1901-2000


 On Thursday, June 19, 2014 3:13:10 PM UTC+5:30, Avinash Celestine wrote:

 Does anyone know of a source of daily temperature data (max,min) for
 cities which goes back a few years?

 Most of the daily data only goes back a couple of months, and those which
 are a couple of years old are monthly averages.

 regards

 Avinash



Re: [datameet] Re: temperature data for cities - historical

2014-06-19 Thread Avinash Celestine
thank you very much for all the links.

I discovered the following source. It doesn't seem to be official, though. Has
anyone explored it?

http://www.tutiempo.net/en/Climate/New_Delhi_Safdarjung/01-2013/421820.htm
(Delhi)
http://www.tutiempo.net/en/Climate/BANGALORE/04-2010/432950.htm (Bangalore)

(In each case, I picked a random year and month.)

It seems fairly comprehensive. Here is the list of cities covered.
http://www.tutiempo.net/en/Climate/India/IN.html

Avinash



On Thu, Jun 19, 2014 at 3:52 PM, Anand Chitipothu anandol...@gmail.com
wrote:

 Hi,

 I have archive of the data from 2012.

 http://anandology.com/tmp/data/www.imdaws.com/archive/

 Thejesh recently started working on a new scraper that exports the data
 to CSV and other formats.

 https://github.com/thejeshgn/imd

 Anand



 On Thu, Jun 19, 2014 at 3:36 PM, Srinivasan Ramani 
 srinivasan...@gmail.com wrote:

 The National Data Centre of the IMD has daily data from 1951 onwards till
 present.

 http://www.imdpune.gov.in/research/ndc/ndc_index.html

 I don't think they have hosted this online, but I suppose you can
 contact them to provide the dataset for your purposes.


 On Thu, Jun 19, 2014 at 3:21 PM, Avinash Celestine 
 avinash.celest...@gmail.com wrote:

 thanks. Yes, I was looking for daily data.

 rgds

 A


 On Thu, Jun 19, 2014 at 3:20 PM, Pratap Vardhan pratap...@gmail.com
 wrote:

 Not quite what you'd want. Here
 http://www.imd.gov.in/section/nhac/mean/110_new.htm you would
 find Monthly Mean Maximum & Minimum temperature and monthly total rainfall
 of important stations for the period 1901-2000


 On Thursday, June 19, 2014 3:13:10 PM UTC+5:30, Avinash Celestine wrote:

 Does anyone know of a source of daily temperature data (max,min) for
 cities which goes back a few years?

 Most of the daily data only goes back a couple of months, and those
 which are a couple of years old are monthly averages.

 regards

 Avinash





 --
 Best Regards,
 Srinivasan V. Ramani ,
 Senior Assistant Editor,
 Economic and Political Weekly ,
 New Delhi: 110 067
 09650855669





 --
 Anand
 http://anandology.com/



Re: [datameet] Booth-wise elector count (male/female/other/total)

2014-06-07 Thread Avinash Celestine
great. thanks

Have the form 20s for UP been put out? I know Bihar and Bengal are out...

Avinash


On Sat, Jun 7, 2014 at 11:56 AM, Raphael Susewind li...@raphael-susewind.de
 wrote:

 Dear all,

 now that Form20 results start to come out, some of you might be
 interested in booth-wise elector count to be able to calculate
 fine-grained turnout rates. They are not contained in Form20, but
 available in the electoral rolls; as a side effect of my ongoing
 academic work, I have extracted these.
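
 The turnout calculation this enables can be sketched roughly as follows. This is only an illustration under stated assumptions: booths are keyed by a hypothetical (AC number, booth number) pair, votes polled come from Form 20, elector counts come from the rolls, and all the numbers are made up.

```python
def booth_turnout(electors, votes_polled):
    """Turnout (%) per booth, for booths present in both inputs with electors > 0."""
    turnout = {}
    for booth, n_electors in electors.items():
        # Skip booths with no matching result or a zero elector count.
        if booth in votes_polled and n_electors > 0:
            turnout[booth] = round(100.0 * votes_polled[booth] / n_electors, 1)
    return turnout


# Made-up numbers keyed by (AC number, booth number).
electors = {(1, 1): 1000, (1, 2): 800, (1, 3): 0}
votes_polled = {(1, 1): 650, (1, 2): 420}
print(booth_turnout(electors, votes_polled))
```

 Booths that appear in only one of the two sources are dropped silently here; in practice one would probably want to list them, since booth numbering can differ between the rolls and Form 20.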

 Here is my pull request to the datameet github:

 https://github.com/datameet/india-election-data/pull/8

 Note that this is based on a quick-hack automated extraction, so no
 guarantees. Also, some states and UTs are missing, notably:

 Uttarakhand - PDF rolls not available
 Chhattisgarh - PDF rolls behind captcha
 Lakshadweep - problem with parsing
 Chandigarh - problem with parsing

 I hope this is useful to some,

 Best,
 Raphael

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web & Twitter | http://www.raphael-susewind.de | @RaphaelSusewind

 Please do consider http://www.gnupg.org for encryption (key id 10AEE42F)



[datameet] fuzzy matching text in Excel

2014-05-22 Thread Avinash Celestine
Thought I would share this link. It gives the code for a fuzzy match
version of Excel's vlookup function. I've used it for a couple of years
actually, and it works quite well. Extremely useful when trying to match
names, places etc.

http://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html#post955137

Avinash
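
For anyone working outside Excel, a comparable lookup can be sketched in Python. This is not the VBA from the linked post: it swaps in the standard library's difflib similarity ratio, and the function name fuzzy_lookup and the 0.8 cutoff are arbitrary choices for illustration.

```python
import difflib


def fuzzy_lookup(name, candidates, cutoff=0.8):
    """Return the candidate closest to `name`, or None if nothing scores >= cutoff."""
    # Compare on a normalised form so case and stray spaces do not matter.
    normalised = {c.strip().lower(): c for c in candidates}
    hits = difflib.get_close_matches(
        name.strip().lower(), list(normalised), n=1, cutoff=cutoff
    )
    return normalised[hits[0]] if hits else None


districts = ["Bulandshahr", "Chittaurgarh", "Jhunjhunun"]
print(fuzzy_lookup("Bulandshahar", districts))
print(fuzzy_lookup("Chittorgarh", districts))
print(fuzzy_lookup("Kolkata", districts))
```

A lower cutoff catches more spelling variants but also produces more false matches, so it is worth eyeballing the matched pairs before trusting them.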



Re: [datameet] Re: Security Issues with the Voter List

2014-05-21 Thread Avinash Celestine
One more data point for our discussion on data privacy in Indian elections,
though from a slightly different perspective.

The EC has told the Supreme Court that it is against making polling-station-level
voting data public:

http://www.financialexpress.com/news/ec-against-wardwise-counting-of-votes-supreme-court-told/1253124

Rgds

Avinash





On 19-May-2014, at 12:09 pm, Dilip Damle cadvis...@gmail.com wrote:

HI

YES,

I think the way the Data access is provided gives transparency but it can
be misused.
I had downloaded Goa and Delhi pdfs several years back.

Then explained to someone on a social network how he/she can be tracked and
Stalked.
PIPL.com can help you get complete name even if your name is hidden on some
networks.
MTNL/BSNL Phone directory can get your number
Voter pdfs can give your address

and this can be done on a mass scale.

My opinion is that they should make the PDFs after rasterising the pages in a kind
of odd, jagged font,
so that they are readable by humans but not easily by machines.

Rgds
Dilip Damle



On Friday, April 11, 2014 9:55:03 AM UTC+5:30, Devdatta Tengshe wrote:

 Hi,
 I found this interesting article by a guy who downloaded and processed the
 Voter list of Delhi: https://medium.com/p/1aff55526881

 I found this via a discussion on Reddit:
 http://www.reddit.com/r/programming/comments/22pn8u/i_wrote_a_few_simple_python_scripts_to_retrieve/

 I'd like to quote his findings here:


1. It is possible to automate the retrieval of every single PDF roll
all across India
2. These PDFs can then be processed in a matter of minutes to produce
details like Addresses, names, father’s name, gender, age and voters ID
number for every single registered voter of India
3. Nearly 25% of the Voter IDs assigned within only Delhi fail to
conform to the government format, and fail the Luhn Checksum test used to
validate them. It is likely that other states are in a similar, if not
worse condition
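
 For reference, the generic Luhn check mentioned in point 3 can be sketched as below. This covers only the standard checksum on a string of decimal digits; how a voter ID (which mixes letters and digits) is turned into the digits to be checked is not described in this thread, so that step is left out rather than guessed.

```python
def luhn_valid(digits):
    """Return True if a non-empty string of decimal digits passes the Luhn checksum."""
    if not digits or any(char not in "0123456789" for char in digits):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as adding the two digits of the product
        total += value
    return total % 10 == 0


print(luhn_valid("79927398713"))
print(luhn_valid("79927398710"))
```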


 Regards,

 Devdatta Tengshe



Re: [datameet] the datameet districts shapefile with 2011 census district codes added

2014-05-21 Thread Avinash Celestine
sure. attached

also, for a cross check, see any of the excels on this link - all the
district level data will have codes attached

http://www.censusindia.gov.in/2011census/population_enumeration.aspx




On Thu, May 22, 2014 at 10:05 AM, Devdatta Tengshe devda...@tengshe.in wrote:

 Hi Avinash,
 Is there a list of these 640 districts along with their codes on the
 Census Website?

 While creating the data, I had not found it, and instead had to use the
 statewise codes that are available in the Administrative atlas.

 Regards,
 Devdatta Tengshe


 On Thu, May 22, 2014 at 9:41 AM, Avinash Celestine 
 avinash.celest...@gmail.com wrote:

 Hi

 This refers to the 2011 district shapefile that was put together by
 datameet. I noticed that the district codes in the shapefile are numbered
 serially by state (starting from 1 for each state separately).

 However, in the Census 2011 data, districts are numbered serially from
 1 to 640 across the country. That is, each district in the country has its own
 unique ID code irrespective of the state to which it belongs.

 So I have added that as an extra column in the attribute table, and the
 revised shapefile is attached. This should make it easier for anyone to match
 the census data to the map.
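
 A rough sketch of how such a column could be filled in, assuming a code list in the State,District,Name layout and shapefile rows that carry a state code and a district name. This is an illustration, not necessarily how the attached file was produced; the shapefile_rows values are hypothetical.

```python
import csv
import io

# A few rows in the State,District,Name layout of the census code list.
CENSUS_CSV = """State,District,Name
01,001,Kupwara
02,023,Chamba
02,028,Hamirpur
08,131,Pratapgarh
"""


def build_code_lookup(csv_text):
    """Map (state code, lower-cased district name) -> nationwide district code."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["State"].strip(), row["Name"].strip().lower())
        lookup[key] = row["District"].strip()
    return lookup


lookup = build_code_lookup(CENSUS_CSV)
# Hypothetical shapefile attribute rows: state code plus district name.
shapefile_rows = [("02", "Hamirpur"), ("08", "Pratapgarh"), ("02", "Shimla")]
codes = [lookup.get((state, name.lower())) for state, name in shapefile_rows]
print(codes)
```

 The state code is part of the key because some district names are likely to recur across states, so matching on the name alone could attach the wrong code. Rows that come back as None need a manual or fuzzy match.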

 regards

 Avinash


State,District,Name
01,001,Kupwara
01,002,Badgam
01,003,Leh(Ladakh)
01,004,Kargil
01,005,Punch
01,006,Rajouri
01,007,Kathua
01,008,Baramula
01,009,Bandipore
01,010,Srinagar
01,011,Ganderbal
01,012,Pulwama
01,013,Shupiyan
01,014,Anantnag
01,015,Kulgam
01,016,Doda
01,017,Ramban
01,018,Kishtwar
01,019,Udhampur
01,020,Reasi
01,021,Jammu
01,022,Samba
02,023,Chamba
02,024,Kangra
02,025,Lahul & Spiti
02,026,Kullu
02,027,Mandi
02,028,Hamirpur
02,029,Una
02,030,Bilaspur
02,031,Solan
02,032,Sirmaur
02,033,Shimla
02,034,Kinnaur
03,035,Gurdaspur
03,036,Kapurthala 
03,037,Jalandhar
03,038,Hoshiarpur
03,039,Shahid Bhagat Singh Nagar 
03,040,Fatehgarh Sahib
03,041,Ludhiana
03,042,Moga
03,043,Firozpur
03,044,Muktsar
03,045,Faridkot
03,046,Bathinda
03,047,Mansa
03,048,Patiala
03,049,Amritsar 
03,050,Tarn Taran
03,051,Rupnagar
03,052,Sahibzada Ajit Singh Nagar
03,053,Sangrur
03,054,Barnala
04,055,Chandigarh
05,056,Uttarkashi
05,057,Chamoli
05,058,Rudraprayag
05,059,Tehri Garhwal
05,060,Dehradun
05,061,Garhwal
05,062,Pithoragarh
05,063,Bageshwar
05,064,Almora
05,065,Champawat
05,066,Nainital
05,067,Udham Singh Nagar
05,068,Hardwar
06,069,Panchkula
06,070,Ambala
06,071,Yamunanagar
06,072,Kurukshetra
06,073,Kaithal
06,074,Karnal
06,075,Panipat
06,076,Sonipat
06,077,Jind
06,078,Fatehabad
06,079,Sirsa
06,080,Hisar
06,081,Bhiwani
06,082,Rohtak
06,083,Jhajjar
06,084,Mahendragarh
06,085,Rewari
06,086,Gurgaon
06,087,Mewat 
06,088,Faridabad
06,089,Palwal 
07,090,North West
07,091,North
07,092,North East
07,093,East
07,094,New Delhi
07,095,Central
07,096,West
07,097,South West
07,098,South
08,099,Ganganagar 
08,100,Hanumangarh
08,101,Bikaner
08,102,Churu
08,103,Jhunjhunun
08,104,Alwar
08,105,Bharatpur
08,106,Dhaulpur
08,107,Karauli
08,108,Sawai Madhopur
08,109,Dausa
08,110,Jaipur
08,111,Sikar
08,112,Nagaur
08,113,Jodhpur
08,114,Jaisalmer
08,115,Barmer
08,116,Jalor
08,117,Sirohi
08,118,Pali
08,119,Ajmer
08,120,Tonk
08,121,Bundi
08,122,Bhilwara
08,123,Rajsamand
08,124,Dungarpur
08,125,Banswara
08,126,Chittaurgarh
08,127,Kota
08,128,Baran
08,129,Jhalawar
08,130,Udaipur
08,131,Pratapgarh
09,132,Saharanpur
09,133,Muzaffarnagar
09,134,Bijnor
09,135,Moradabad
09,136,Rampur
09,137,Jyotiba Phule Nagar
09,138,Meerut
09,139,Baghpat
09,140,Ghaziabad
09,141,Gautam Buddha Nagar
09,142,Bulandshahr 
09,143,Aligarh
09,144,Mahamaya Nagar
09,145,Mathura
09,146,Agra
09,147,Firozabad
09,148,Mainpuri
09,149,Budaun
09,150,Bareilly
09,151,Pilibhit
09,152,Shahjahanpur
09,153,Kheri
09,154,Sitapur
09,155,Hardoi
09,156,Unnao
09,157

Re: [datameet] Realtime data of Delhi air pollution - Ban on linking, scraping!

2014-05-21 Thread Avinash Celestine
yes that is a little extreme :-)

Has anyone explored the CPCB's environmental data bank?

http://cpcbedb.nic.in/default.htm

(best viewed on IE)


On Wed, May 21, 2014 at 5:50 PM, Thejesh GN i...@thejeshgn.com wrote:

 Realtime data of Delhi air pollution

 http://www.dpccairdata.com/dpccairdata/display/pbView15MinData.php

 I was surprised to see this at the top. Is even linking prohibited?

 LINKING, FRAMING, MIRRORING, SCRAPING OR DATA-MINING STRICTLY PROHIBITED.

  Thej
 --
 Thejesh GN *⏚* ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0



[datameet] Re: some shapefiles for delhi

2014-05-13 Thread Avinash Celestine
I should add that the delhi colonies shapefile only covers areas under
MCD, not NDMC or delhi cantt

Avinash

On Tue, May 13, 2014 at 3:52 PM, Avinash Celestine
avinash.celest...@gmail.com wrote:
 Hi

 Couple of shapefiles for delhi

 1. police stations jurisdiction (source :
 http://www.delhipolice.nic.in/knowurps.html)
 2. delhi colonies (source : http://app.mapmyindia.com/mcdApp/)

 The Delhi colonies shapefile has colonies classified as unauthorised
 colony/planned development/industrial etc. However, there are some
 places where the classification is missing or not decipherable
 (marked 'Un..', for instance). I have left these unchanged rather than
 impose my judgement on what they should be.

 both attached.

 Avinash

-- 
For more details about this list
http://datameet.org/discussions/
--- 
You received this message because you are subscribed to the Google Groups 
datameet group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [datameet] Q on retrieving DUSIB data from G maps engine

2014-05-05 Thread Avinash Celestine
Thanks, Thej. I figured that the layers were 'embedded' in the map tiles
themselves rather than being rendered separately browser-side, but wasn't at
all sure how to proceed from there.

The dynamic Maps Engine layer solution that you suggested definitely seems
worth a shot. Will give it a try...

Avinash
 On May 5, 2014 5:50 PM, Thejesh GN i...@thejeshgn.com wrote:

 Here is what I could figure:

 The layers are defined in the javascript :
 https://gist.github.com/thejeshgn/fe9853115b2b7cbc7d50#file-cluster-json

 And it is displayed using the Javascript
 https://gist.github.com/thejeshgn/fe9853115b2b7cbc7d50#file-mapviewer-js


 All the layers are of type GOOGLE_MAPS_ENGINE.
 When you check the map viewer code, it creates an object of
 *google.maps.visualization.MapsEngineLayer* to display the maps, the
 documentation for which is at


 https://developers.google.com/maps/documentation/javascript/mapsenginelayers

 With the following note:

 *The MapsEngineLayer constructs tiles server-side and returns the tiles to
 the client. Use this class if you don't want to re-style the Maps Engine
 data on the client side.*

 I did check the response to confirm, here is one of the tile:
 https://s3.amazonaws.com/media.thejeshgn.com/img/screenshot/tiles.png




 *What we can do:*

 On the same page using inspector construct DynamicMapsEngineLayer

 *The DynamicMapsEngineLayer (experimental) returns vector data to your
 client, along with the base map tiles. Your application can change the
 vectors' styling in response to user input or other triggers.
 (The DynamicMapsEngineLayer class is not supported in Internet Explorer 8
 and below.)*

 And then download it on the client side? I am not sure whether this reverse
 engineering works, but it's not a bad idea to try.







 Thej
 --
 Thejesh GN *⏚* ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0


 On Mon, May 5, 2014 at 3:48 PM, Avinash Celestine 
 avinash.celest...@gmail.com wrote:

 Hi all

 Needed some help

 I just wanted to know if it is at all possible to retrieve the data
 (geometries etc.) underlying the map linked to on this page (2nd link). It's
 a map of all JJ clusters in Delhi.

 http://delhishelterboard.in/main/?page_id=3644

 Usually, when you load a page, the files retrieved from the server include
 a GeoJSON file or similar with the underlying geometries, but that
 doesn't seem to be the case here. Do tell me if such a thing is possible, or,
 for that matter, not possible.

 regards

 Avinash



Re: [datameet] Cleaning MCD ward shapefile data

2014-04-24 Thread Avinash Celestine
let me check and get back. give me a day or two.

Avinash


On Thu, Apr 24, 2014 at 3:34 PM, Padmanabh prabhas.padman...@gmail.com wrote:

 Hi Avinash,

 First of all thank you for providing the data.

 I checked with the MCD source and have made screenshots for the
 Dellopura ward here (http://imgur.com/a/5ycYw). The first image is the
 official source; the second and third images are the two different entries
 in the shapefile, which have the same name and attributes but different geometries.
 Would you have an idea how to handle this?

 Thanks.

 Screenshots: http://imgur.com/Mh0SKEq,b9ewUcY,EuwZL8r

 On Thursday, April 24, 2014 3:20:46 PM UTC+5:30, Avinash Celestine wrote:

 Hi

 Here's the source from which I adapted the point data

 http://app.mapmyindia.com/mcdApp/

 this should enable you to check the veracity of the areas. as a plus
 point its an official source.

 Avinash


 On Thu, Apr 24, 2014 at 2:48 PM, Padmanabh prabhas@gmail.com wrote:

 I am using MCD Ward Data as provided by Avinash Celestine here[1]. I
 imported the data into PostGIS and tried to correlate with ADR's list by
 matching the ward number for the two sets of data.

 In the shapefile, there are often two different rows with the same
 ward number and name but different geometries. The following wards have
 two entries each:

 wardnum wardname
 211 Dellopura II
 252 Mauj Pur
 266 Karawal Nagar East
 267 Nehru Vihar
 44 Quammruddin Nagar
 7 Burari
 117 Janak Puri West
 118 Janak Puri South
 125 Mohan Garden
 126 Nawada
 127 Uttam Nagar
 128 Bindapur
 134 Nangli Sakravati
 153 Daryaganj
 171 Vasant Kunj
 193 Sriniwaspuri
 206 Okhla
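
 A check for such repeated ward numbers can be sketched as follows. This is a minimal illustration on plain attribute rows, not the script actually used in this thread; the records list is a made-up stand-in for the shapefile's attribute table, and the field names wardnum and wardname follow the column headings above.

```python
from collections import defaultdict


def find_duplicate_wards(records):
    """Return {ward number: count} for ward numbers that occur more than once."""
    counts = defaultdict(int)
    for record in records:
        counts[record["wardnum"]] += 1
    return {ward: n for ward, n in counts.items() if n > 1}


# Made-up attribute rows standing in for the shapefile's attribute table.
records = [
    {"wardnum": 7, "wardname": "Burari"},
    {"wardnum": 7, "wardname": "Burari"},
    {"wardnum": 153, "wardname": "Daryaganj"},
]
print(find_duplicate_wards(records))
```

 Whether the duplicate geometries should then be merged into one feature or one of them dropped depends on how they compare with the official map, so the sketch stops at detection.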

 Another minor difference with the ADR data is a name mismatch in the
 following wards

 (ADR Ward Name) -- (Shapefile Ward Name)
 TRILOK PURI -- Dellopura II
 SONIA VIHAR -- Sadatpur Gujran
 PUSA -- Sat Nagar
 SITAPURI -- Manohar Park
 ANDREWS GANJ -- Adnisganj
 CHIRAG DELHI -- Greater Kailash II

 I have not been able to find any other source which provides shapefiles
 for the MCD wards, so I have nothing to compare it with. Would anyone have
 a clue how to handle the double entries in the shapefile?


 [1] (https://groups.google.com/d/msg/datameet/RPkoxyxVnRg/4UkvSQW6MU4J)



Re: [datameet] Cleaning MCD ward shapefile data

2014-04-24 Thread Avinash Celestine
Oh, that's good news. No thanks necessary at all. I'm just glad someone
spotted the problem in the earlier file.

As to the licence, wouldn't the licence that applies to the other datasets on
this group apply? I am less knowledgeable about this, so I will ask others
to weigh in...
On Apr 24, 2014 7:06 PM, Padmanabh prabhas.padman...@gmail.com wrote:

 I ran a script to check if it contained any duplicates and it doesn't.
 Thanks a lot, once again.

 Btw, what license should I consider this data to be under?

 On Thursday, April 24, 2014 5:22:59 PM UTC+5:30, Avinash Celestine wrote:

 Hi

 Can you check out the attached shapefile and tell me if it's OK? Basically,
 I reconstructed the shapefile from the individual points; that seemed
 the simplest way.

 I checked Dellopura in the new one and I am only getting one geometry,
 but do confirm.

 rgds

 Avinash


 On Thu, Apr 24, 2014 at 3:51 PM, Avinash Celestine avinash@gmail.com
  wrote:

 let me check and get back. give me a day or two.

 Avinash


 On Thu, Apr 24, 2014 at 3:34 PM, Padmanabh prabhas@gmail.com wrote:

 Hi Avinash,

 First of all thank you for providing the data.

 I checked with the MCD source and I have made screenshots for the
 Dellopura Ward here (http://imgur.com/a/5ycYw). The first image is the
 official source; the second and third images are the two different entries
 in the shapefile, which have the same name and attributes but different
 geometries. Would you have an idea how to handle this?

 Thanks.

 Screenshots: http://imgur.com/Mh0SKEq,b9ewUcY,EuwZL8r

 On Thursday, April 24, 2014 3:20:46 PM UTC+5:30, Avinash Celestine
 wrote:

 Hi

 Here's the source from which I adapted the point data

 http://app.mapmyindia.com/mcdApp/

 This should enable you to check the veracity of the areas. As a plus
 point, it's an official source.

 Avinash


 On Thu, Apr 24, 2014 at 2:48 PM, Padmanabh prabhas@gmail.com wrote:

 I am using MCD Ward Data as provided by Avinash Celestine here[1]. I
 imported the data into PostGIS and tried to correlate with ADR's list by
 matching the ward number for the two sets of data.

 In the shapefile, there are often two different rows with the
 same ward number and name but different geometries. The following
 wards have two entries each:

 wardnum  wardname
 211      Dellopura II
 252      Mauj Pur
 266      Karawal Nagar East
 267      Nehru Vihar
 44       Quammruddin Nagar
 7        Burari
 117      Janak Puri West
 118      Janak Puri South
 125      Mohan Garden
 126      Nawada
 127      Uttam Nagar
 128      Bindapur
 134      Nangli Sakravati
 153      Daryaganj
 171      Vasant Kunj
 193      Sriniwaspuri
 206      Okhla

 Another minor difference with the ADR data is a name mismatch in the
 following wards

 (ADR Ward Name) -- (Shapefile Ward Name)
 TRILOK PURI -- Dellopura II
 SONIA VIHAR -- Sadatpur Gujran
 PUSA -- Sat Nagar
 SITAPURI -- Manohar Park
 ANDREWS GANJ -- Adnisganj
 CHIRAG DELHI -- Greater Kailash II

 I have not been able to find any other source which provides
 shapefiles for the MCD wards, so I have nothing to compare it with. Would
 anyone have a clue how to handle the double entries in the shapefile?


 [1] (https://groups.google.com/d/msg/datameet/RPkoxyxVnRg/4UkvSQW6MU4J)
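
The duplicate check described in this message can be sketched in a few lines of Python. This is only an illustration, assuming the shapefile's attribute table has already been read into (wardnum, wardname) pairs; the function name `find_duplicate_wards` and the sample rows (including ward 999) are made up and are not the script actually used in this thread.

```python
from collections import defaultdict


def find_duplicate_wards(rows):
    """Return {wardnum: [wardname, ...]} for ward numbers that occur more than once."""
    by_ward = defaultdict(list)
    for wardnum, wardname in rows:
        by_ward[wardnum].append(wardname)
    # Keep only ward numbers with two or more rows.
    return {num: names for num, names in by_ward.items() if len(names) > 1}


# Hypothetical attribute rows, invented for illustration only.
rows = [
    (211, "Dellopura II"),
    (211, "Dellopura II"),
    (7, "Burari"),
    (7, "Burari"),
    (999, "Example Ward"),
]

duplicates = find_duplicate_wards(rows)
print(sorted(duplicates))
```

Each reported ward number would then need a manual look at its geometries to decide which row to keep, or whether the two should be merged.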



[datameet] transliteration tools?

2014-04-24 Thread Avinash Celestine
Hi

Are there any good tools/code etc out there which enable you to do bulk
transliteration (not translation) across languages - specifically names in
Hindi(or any other Indian language) to names in English?



thanks

Avinash
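
To make the question concrete, here is a toy sketch of what rule-based transliteration from Devanagari to Latin script involves. It is not a recommendation of any tool: the mapping table is deliberately tiny and follows no standard romanisation scheme, and the names `CONSONANTS`, `MATRAS` and `transliterate` are invented for this example.

```python
# Toy Devanagari-to-Latin table: deliberately incomplete, not a standard scheme.
CONSONANTS = {
    "\u0915": "k",   # ka
    "\u0924": "t",   # ta
    "\u0926": "d",   # da
    "\u0928": "n",   # na
    "\u092e": "m",   # ma
    "\u0930": "r",   # ra
    "\u0932": "l",   # la
    "\u0935": "v",   # va
    "\u0936": "sh",  # sha
    "\u0938": "s",   # sa
}
VOWELS = {"\u0905": "a", "\u0906": "aa", "\u0907": "i", "\u0908": "ii"}
MATRAS = {
    "\u093e": "aa",
    "\u093f": "i",
    "\u0940": "ii",
    "\u0941": "u",
    "\u0942": "uu",
    "\u0947": "e",
    "\u094b": "o",
}
VIRAMA = "\u094d"  # suppresses the inherent vowel of the preceding consonant


def transliterate(text):
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # A bare consonant carries an inherent "a" unless a matra or virama follows.
            if nxt not in MATRAS and nxt != VIRAMA:
                out.append("a")
        elif ch in MATRAS:
            out.append(MATRAS[ch])
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch == VIRAMA:
            continue
        else:
            out.append(ch)  # pass through anything the table does not cover
    return "".join(out)


# The names "Ram" and "Avinash" written in Devanagari, given as escapes.
names = ["\u0930\u093e\u092e", "\u0905\u0935\u093f\u0928\u093e\u0936"]
for name in names:
    print(name, "->", transliterate(name))
```

With this table the two names come out as "raama" and "avinaasha". The trailing "a" is the inherent vowel that spoken Hindi usually drops, which is one reason naive character mapping is not enough for names.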



Re: [datameet] transliteration tools?

2014-04-24 Thread Avinash Celestine
great. thanks


On Fri, Apr 25, 2014 at 10:15 AM, Thejesh GN i...@thejeshgn.com wrote:

 I used Google's API to do this a while back

 http://thejeshgn.com/2011/02/04/batch-transliterating-names-into-kannada-using-google-api/

 It's not the best, but it works for most cases.

 Mine was from English to Kannada; you can probably try it the other way around.

 --
 Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0
 On Apr 25, 2014 10:04 AM, Avinash Celestine avinash.celest...@gmail.com
 wrote:

 Hi

 Are there any good tools/code etc out there which enable you to do bulk
 transliteration (not translation) across languages - specifically names in
 Hindi(or any other Indian language) to names in English?



 thanks

 Avinash



Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Avinash Celestine
Hi Gautam

I don't think the issue is with having the electoral roll available publicly
per se. Personally, I think it's better that the rolls are available in the
open, as compared with the alternative, where they are confidential, thus
leaving them open to other types of abuse.

But I do think that certain minimum safeguards should be in place - even
something as simple as a captcha code (as mentioned in the link which
started off this thread), to deter heavy bulk downloading... it seems to me
the bare minimum.

Now, will this stop me from searching for someone specific within the
voters list that I want to target, given that I have a rough idea of where
they live? Certainly not.

Coupled with this is the irony that other datasets for which there is
absolutely no reason for secrecy (at least I can't conceive of a reason for
it - maybe it's pure bureaucracy) are extremely difficult to get. A case in
point is any official version of the PC and AC shapefiles, which Raphael and
others on this group have been trying so hard to create.

Raphael is right - these are complex issues. And we have barely begun to
scratch the surface of what should be done. Interestingly, in the reddit
thread linked above, there are references to the fact that New York and
Sweden also provide vast amounts of personal information for little or no
fee...

Avinash




On Fri, Apr 11, 2014 at 11:57 AM, Gautam John gkj...@gmail.com wrote:

 Leaving aside my earlier comment as perhaps tongue in cheek, the
 electoral rolls are *meant* to be public. The Registration of Electors
 Rules, 1960 makes that clear. However, your larger point is well made.
 Maybe what needs to be done is to *de-centralise* the storage? That
 fulfils the requirements of the Registration of Electors Rules, 1960
 while making it harder to do something like this.

 It says: As soon as the roll for a constituency is ready, the
 registration officer  shall publish it in draft by making a copy
 thereof available for inspection and displaying a notice in Form 5--
 (a) at his office, if it is within the constituency, and  (b) at such
 place in the constituency as may be specified by him for the purpose,
 if his office is outside the constituency ; [or in the official
 website of the Chief Electoral Officer of the concerned State:]
 [Provided that where such draft contains names of overseas electors,
 the copies of such rolls shall also be published in the Electronic
 Gazette 6 [or in the official website of the Chief Electoral Officer
 of the concerned State].]

 The Representation of the People Act, 1951 contains this: The
 Government shall, at any election to be held for the purposes of
 constituting the House of the People or the Legislative Assembly of a
 State, supply, free of cost, to the candidates of recognised political
 parties such number of copies of the electoral roll, as finally
 published ...

 Worth asking if we want political parties to have free access to it
 but not citizens.
 People Act, 1950 (43 of 1950)



Re: [datameet] Mapping elections: open GIS shapefile drafts

2014-04-09 Thread Avinash Celestine
this is great, Raphael. thanks very much..

Avinash


On Thu, Apr 10, 2014 at 12:18 AM, Thejesh GN i...@thejeshgn.com wrote:

 This is super awesome. As of now I am going through Delhi files, I will
 let you know if I find any interesting stuff.

 Thej

 --
 Thejesh GN *⏚* ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0


 On Wed, Apr 9, 2014 at 9:23 PM, Raphael Susewind 
 li...@raphael-susewind.de wrote:

 Dear all,

 Krishna Prasnth's plea for AC shapefiles made me decide to start pushing
 mine out there ahead of time in draft form at least. I would have loved
 to have them ready before the Bangalore hackathon, but  such things take
 time and I am quite busy.

 Still, here they come at last: draft GIS shapefiles of parliamentary
 constituencies, assembly constituencies and polling booth localities,
 published under an open license (CC-BY-NC-SA 4.0):

 http://www.raphael-susewind.de/blog/2014/mapping-indias-election

 Unlike the hackathon files, these were created using an automated
 algorithm (described in the blog post above). I intend to release (and
 long-time archive) them by end of the month, and would welcome comments
 and feedback until then: if you are familiar with both GIS and a
 specific state, it would help me a lot if you could have a look.
 Likewise, comments on the general method are very welcome.

 So far, the smaller states are online, but I will add more on a rolling
 basis - computing takes a few hours per constituency (longer for the
 larger states). I hope to complete the set by end of the week.

 Let me know if you find them useful,

 Best,
 Raphael

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Papers & Blog | http://www.raphael-susewind.de

 Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)



[datameet] format for release of candidates assets/ criminal records data

2014-04-05 Thread Avinash Celestine
Hi all

I recall a letter being circulated a few days ago about data requests to the
ECI. I had a suggestion about the format in which the ECI releases (or
enables the release of) the data contained in candidates' affidavits. I
wasn't sure whether this suggestion fit into that letter, so I broke it out
into a separate thread.

Right now, candidate affidavits are released in the worst possible format -
PDFs - not just for those looking to do some in-depth analysis, but
even for the journalist or voter who wants to know more about a candidate's
declarations.

I wondered if there was any possibility of engaging with the ECI over the
longer term (obviously this election is out of the question) to change the
format in which candidates must declare such data. They could be required
to declare it in a couple of formats - one would be PDF, of course, for
those interested in a specific candidate. But the other format I had
in mind was XML.

The best analogy I can think of is the recent requirement by the
Ministry of Corporate Affairs that companies file their reports with the
Registrar of Companies in XBRL, an XML-based format. The EC could require
candidates to do the same - they could make it easy by providing software
(it could be as simple as an Excel application) which will generate the XML
and the PDF as well. (The other analogy here is the online filing of tax
returns, which requires an XML file to be submitted online.)

The biggest criticism of any such change (that i can think of) is that
poorer candidates, who don't have good access to a computer might find it
extremely difficult to file this way. Perhaps there could be special
facilities provided at the level of district electoral officer, to enable
such candidates to make the filing.

Once the filing is done in xml, it would be easy to convert it into more
friendly formats (for humans), such as excel etc.
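
As a rough illustration of that last step, the sketch below turns a made-up affidavit XML into CSV that Excel can open, using only the Python standard library. No such schema exists today: the element names (`candidate`, `movable_assets`, `criminal_cases` and so on), the sample values and the function name `affidavits_to_csv` are all assumptions invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical affidavit XML; the ECI publishes no such schema.
SAMPLE_XML = """
<affidavits>
  <candidate>
    <name>Candidate A</name>
    <constituency>Example PC</constituency>
    <movable_assets>1500000</movable_assets>
    <immovable_assets>4000000</immovable_assets>
    <criminal_cases>0</criminal_cases>
  </candidate>
  <candidate>
    <name>Candidate B</name>
    <constituency>Example PC</constituency>
    <movable_assets>250000</movable_assets>
    <immovable_assets>0</immovable_assets>
    <criminal_cases>2</criminal_cases>
  </candidate>
</affidavits>
"""

FIELDS = ["name", "constituency", "movable_assets", "immovable_assets", "criminal_cases"]


def affidavits_to_csv(xml_text):
    """Flatten each <candidate> element into one CSV row."""
    root = ET.fromstring(xml_text.strip())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for candidate in root.findall("candidate"):
        # Missing elements become empty cells rather than raising an error.
        writer.writerow({field: candidate.findtext(field, default="") for field in FIELDS})
    return buffer.getvalue()


csv_text = affidavits_to_csv(SAMPLE_XML)
print(csv_text)
```

The point is only that once declarations are structured, conversion is a few lines of code; extracting the same fields from scanned PDFs is a far larger job.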

Anyway, that was my suggestion. Perhaps it's possible to engage with the EC in
the longer term to develop the standards for such a format.

regards

Avinash



Re: [datameet] Good data from ECI - possible to extract?

2014-03-27 Thread Avinash Celestine
you could scrape it... or you could just export it to Excel :-)

Use the icon at the top left of the screen, just below the 'Home' link. Just
select "AC wise" and all states in the options.

A


On Thu, Mar 27, 2014 at 12:43 PM, Thejesh GN i...@thejeshgn.com wrote:

 Very True,

 I have added it to my scraping list :) will keep you/list updated.


 Thej
 --
 Thejesh GN *⏚* ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0


 On Thu, Mar 27, 2014 at 12:25 PM, Shree D N shre...@oorvani.in wrote:

 Thej, my requirement is limited to Bangalore, I will do it myself as it
 is just three items.
 Felt it was interesting, so shared it. May be somebody wants to download
 it for all India level - it will be interesting documentation. Sometimes EC
 data vanishes / becomes untraceable after elections.



 On 27 March 2014 12:16, Thejesh GN i...@thejeshgn.com wrote:

 It's possible, even though it's a little complicated due to Business
 Objects and its frames.

 Can you tell us what you want to do with this? If it's important and
 immediate, I can probably spend some time on it.





 Thej
 --
 Thejesh GN *⏚* ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0


 On Thu, Mar 27, 2014 at 12:03 PM, Shree D N shre...@oorvani.in wrote:

 They have AC-wise gender ratio of voters, voters with EPIC, population
 vs. electors ratio, etc.

 http://www.eci-polldaymonitoring.nic.in/erollpublic/

 --
 ---
 Cheers,

 *Shree *





 --
 ---
 Cheers,


 *Shree | Associate Editor | Oorvani Media Pvt LtdPublications:*
 citizenmatters.in | indiatogether.org
 Bangalore | Tel: +91-80-4173 7584 | Mobile: +91-95909 35559
 Follow us on Twitter https://twitter.com/citizenmatters | Follow us on
 Facebook https://www.facebook.com/citizenmatters



Re: [datameet] NSSO Rural Price/Wage Data

2014-03-22 Thread Avinash Celestine
Hi Fenella,

Check each of the links below. You may be able to find what you want

http://mospi.nic.in/Mospi_New/upload/concepts_golden.pdf  (older doc, so
not sure how relevant)
http://mospi.nic.in/mospi_new/upload/nsso/nss_regions.pdf
http://mospi.nic.in/Mospi_New/site/inner.aspx?status=3&menu_id=48

rgds

Avinash


On Sat, Mar 22, 2014 at 11:37 AM, Fenella C fenella.carp...@gmail.com wrote:

 Hi All,

 I'm currently working with the Rural Price/Wage data (the questionnaire is
 here: http://mospi.nic.in/Mospi_New/upload/nsso/fod/Schedule_RPC.pdf),
 which I got from the NSSO. I was wondering if anyone here has any
 experience working with this data? The data is unfortunately not well
 documented, and I need to know what the state/district codes correspond to
 (for matching with other survey data), and similarly for the variable
 source code in the data. I've already tried asking their Data Processing
 Division for more information, but have not yet received a response.

 Many thanks,
 Fenella



Re: [datameet] Parliamentary Constituency to Assembly Constituency to Ward linkages

2014-03-15 Thread Avinash Celestine
Thanks. The rule, as far as I remember, is that ACs are entirely contained
within a district boundary. PCs, on the other hand, can span
district boundaries.

A
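
A rule like this is easy to test against data. The sketch below assumes polling-station records have been reduced to (AC, district) pairs and lists every AC that appears under more than one district. The function name `acs_spanning_districts` and the sample records are invented for illustration and are not taken from the ECI data.

```python
from collections import defaultdict


def acs_spanning_districts(stations):
    """Return {ac: [district, ...]} for ACs whose polling stations fall in 2+ districts."""
    districts_by_ac = defaultdict(set)
    for ac, district in stations:
        districts_by_ac[ac].add(district)
    return {
        ac: sorted(districts)
        for ac, districts in districts_by_ac.items()
        if len(districts) > 1
    }


# Hypothetical polling-station records: (assembly constituency, district).
stations = [
    ("AC-1", "District X"),
    ("AC-1", "District X"),
    ("AC-2", "District X"),
    ("AC-2", "District Y"),
]

spanning = acs_spanning_districts(stations)
print(spanning)
```

An empty result would support the rule for that state; any entry is a counterexample worth checking by hand, since it could also be a data-entry error.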


On Sat, Mar 15, 2014 at 1:19 PM, Raphael Susewind li...@raphael-susewind.de
 wrote:

 Hi Avinash and all,

 I realized that each constituency falls within only one district in your
 file, but there are constituencies that span several districts and vice
 versa (rare, but it happens). I attached a list of those, extracted from
 polling-station data on eci-polldaymonitoring.nic.in. These are AC only,
 naturally the problem would proliferate if you aggregate to PC,

 Hope it helps,
 Raphael

 On 15.03.2014 06:57, Avinash Celestine wrote:
  hi
 
  attached an excel with AC-PC-district -states matching along with codes
  for AC-PC. I can add census district codes if you like...give me a day
  or two
 
  some states are not present - like JK... if someone could add those
  that would be great
 
  Avinash
 
 
  On Fri, Mar 14, 2014 at 10:27 PM, indro ray rayindro@gmail.com wrote:
 
  Hi Anand (Chitipothu),
  Can I know the source from where you get the polling booth and ward
  data? Is it individual for each state and does it provide the
  lat-long for the polling booths?
 
  Thanks,
  Indro
 
 
  On Wed, Mar 12, 2014 at 9:45 AM, Anand Chitipothu
  anandol...@gmail.com wrote:
 
 
 
  On Wed, Mar 12, 2014 at 8:19 AM, Siddarth Raman
  thriddas.ano...@gmail.com wrote:
 
  Hi All,
 
  In line with the discussions on elections, this is something
  I'd started working on a while back (and dropped). I was
  essentially hoping for a PC to AC to Ward mapping. As far as
  I understand, census 2011 has population data either at the
  level of the ward or the district, so if we had to run even
  rudimentary data analysis on a parliamentary or assembly
  constituency (like total population) accurately, I'm
  guessing we need to go bottom up.
 
  I had started this by attempting to convert
  http://eci.nic.in/eci_main/CurrentElections/CONSOLIDATED_ORDER%20_ECI%20.pdf
  into Excel (using a mixture of pattern matching in Notepad++ and
  a bit of Excel VB). It's time consuming (largely because
  each state follows its own convention - not standardized).
 
  Any suggestions on how one might go about this? If I wanted
  to estimate the population in a parliamentary constituency,
  or the total households, or the urban/rural split, how would
  I go about it? Is there a better method than looking at the
  above demarcation notification? Are there datasets on this
  already?
 
  New to the group, didn't find any prior discussions on
  Parliamentary to Assembly to Ward/Village demarcations.
 
 
  Hi Siddarth,
 
  The voter list PDFs have the ward info for each polling booth.
  The PDFs have the number of voters, but not the population. So it is
  possible to sum up those numbers to get a count of the number of
  voters in a PC or AC.

  If you want a polling booth to ward mapping, I'll be able to
  provide it.
 
  Anand
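
The bottom-up summation described here can be sketched as follows, assuming the booth-level voter counts have already been extracted from the PDFs into (PC, AC, voters) tuples. The function name `total_voters` and every figure below are invented for illustration.

```python
from collections import Counter


def total_voters(booths):
    """Sum booth-level voter counts up to AC and PC level."""
    per_ac = Counter()
    per_pc = Counter()
    for pc, ac, voters in booths:
        per_ac[ac] += voters
        per_pc[pc] += voters
    return per_ac, per_pc


# Hypothetical booth records: (parliamentary constituency, assembly constituency, voters).
booths = [
    ("PC-1", "AC-1", 850),
    ("PC-1", "AC-1", 920),
    ("PC-1", "AC-2", 1010),
    ("PC-2", "AC-3", 760),
]

per_ac, per_pc = total_voters(booths)
print(dict(per_ac))
print(dict(per_pc))
```

This gives electors, not population; estimating population per constituency would still need a separate link from wards or villages to census data.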
 

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Papers

Re: [datameet] Parliamentary Constituency to Assembly Constituency to Ward linkages

2014-03-15 Thread Avinash Celestine
Hmm, yes, that's true. It's basically an inefficient way to engineer seat gains
- there are many other more efficient ways!

A




On Sat, Mar 15, 2014 at 2:00 PM, Srinivasan Ramani
srinivasan...@gmail.com wrote:

 Interjecting in a fantastic conversation... (Kudos to Avinash & Raphael
 and others for the efforts to mix and match AC-PC and administrative
 jurisdictions.)

 There is no direct containment of ACs within a district. A case in point is
 Delhi, where ACs don't fit single districts at all.

 Avinash,

 The trouble with the kind of political delimitation that you talk about is
 that it doesn't really serve any purpose. With powers cross-determined
 at various levels - blocks, wards, districts under the bureaucracy
 vis-a-vis MLAs - changing administrative jurisdictions doesn't make as much
 sense as direct gerrymandering for political vote-gaining. In
 other words, the administrative powers of an MLA are much too nebulous
 compared to those of district officials across the bureaucracy and the
 third tier of democracy.


 On Sat, Mar 15, 2014 at 1:49 PM, Avinash Celestine 
 avinash.celest...@gmail.com wrote:

 Unfortunately you may be right... so that's another layer of complexity...

 On a slightly related note, I have often wondered (though I don't know if
 it's actually possible in practice) whether governments could do some
 delimitation on their own (for political purposes). For instance, if a
 village/area is near the border of a constituency, it's possible through an
 order to bring it under the administrative jurisdiction of a neighbouring
 district. If that district is then served by a different AC, you have
 effectively done some delimitation of your own, without actually calling it
 that.

 Given that delimitation papers don't specify individual villages in many
 cases, it seems entirely possible to do...

 looking forward to your dataset, Raphael!

 avinash


 On Sat, Mar 15, 2014 at 1:33 PM, Raphael Susewind 
 li...@raphael-susewind.de wrote:

 Might well be the rule (I remember having read something like this,
 too), but the reality apparently differs (at least in the EC's own
 data)... Never depend on rules, check them! ;-)


Re: [datameet] Parliamentary Constituency to Assembly Constituency to Ward linkages

2014-03-13 Thread Avinash Celestine
Oh I see, so it's worse than I thought :-(

You are right. I doubt the EC will fix it (for entirely good reasons on
their part - they have more important things to worry about).

I am trying a couple of alternative methods. Let me see if anything works -
I will report back. For now, OCR seems to be the best option.

Avinash




On Thu, Mar 13, 2014 at 12:33 PM, Raphael Susewind 
li...@raphael-susewind.de wrote:

 Hey Avinash,

 yep - thats what I figured, too. Not only misplaced matras (those could
 be rearranged), but a real garbling, which cannot be resolved as far as
 I see. Worse, there isnt even a clear pattern - for a few
 constituencies, I fed the Voter ID (which is in latin script) to the
 search roll details by voter ID function on the CEO website, which
 returns the properly written unicode name. I then compared garbled name
 and unicode name to see if there are any statistical regularities - yet
 unfortunately, there are a thousand ways of garbling Avinash - its not
 always Abniszhaa.

 The only solution I can think of is the following (but I have not
 implemented it): train TesserAct (an IndicScript OCR) with the exact
 font used in the PDF reports, so that it almost perfectly recognizes
 something written in this font (this was a stumblestone for me, rather
 complicated work), then extract images of text areas of interest, and
 run them through OCR. If you want to give it a shot...

 Otherwise, we could only try to convince the EC to fix the bug in
 Crystal Reports, and re-generate all PDFs - which is highly unlikely,
 they have more important things to do right now (the PDFs display and
 print alright, after all, just text extraction does not work - they
 would perhaps even consider it a feature rather than a bug).

 It might be useful to compile a list of states where this problem occurs
 - I have seen it in Gujarat and UP for sure, but don't know whether it
 happens everywhere.

 Best,
 Raphael


Re: [datameet] Parliamentary Constituency to Assembly Constituency to Ward linkages

2014-03-12 Thread Avinash Celestine
well i checked out the unicode table and it only confirms what we knew
anyway... that there's duplication of unicode hex values for different
characters...
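
To make the duplication concrete, here is a small sketch of the check. It assumes the glyph-to-Unicode table has already been transcribed by hand (for example from a font report) into a Python dict; the glyph ids and code points below are invented for illustration and are not taken from the actual roll.

```python
from collections import defaultdict


def ambiguous_code_points(glyph_to_unicode):
    """Return code points that more than one glyph id maps to."""
    by_code_point = defaultdict(list)
    for glyph_id, code_point in glyph_to_unicode.items():
        by_code_point[code_point].append(glyph_id)
    return {
        code_point: sorted(glyph_ids)
        for code_point, glyph_ids in by_code_point.items()
        if len(glyph_ids) > 1
    }


# Hypothetical table: glyph id -> Unicode code point (values are made up).
sample = {1: 0x0915, 2: 0x093F, 3: 0x093F, 4: 0x0940}
for code_point, glyph_ids in ambiguous_code_points(sample).items():
    print(f"U+{code_point:04X} is shared by glyph ids {glyph_ids}")
```

Once two different glyphs share a code point, the extracted text no longer records which glyph was meant, so no amount of post-processing on the copied text alone can undo it.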

So i guess its back to the drawing board.


On Thu, Mar 13, 2014 at 9:43 AM, Avinash Celestine 
avinash.celest...@gmail.com wrote:

 Hi Raphael

 In fact the problem with the UP rolls is exactly what I am grappling with
 now. It seems to me that one way is to look at the exact mapping of Unicode
 characters embedded within the files. One way of generating such maps is to
 use a plugin like PDFlib's FontReporter, which works with Adobe Acrobat
 (http://www.pdflib.com/products/fontreporter/). Have you tried out this
 method and did it work for you? Do tell me if you (or anyone else) have
 given it a shot. I am planning to give it a go at least...

 I have attached a sample roll (of an AC in Agra), along with the generated
 font report if anyone wants to give it a look.

 A closer look at the roll shows that the main problem seems to be with the
 Devanagari 'matras', which are not rendering correctly when you cut and paste.

 regards

 Avinash


 On Wed, Mar 12, 2014 at 12:19 PM, Raphael Susewind 
 li...@raphael-susewind.de wrote:

 Hey Siddhart, and Anand,

 I, too, am really interested in this, but have not made much progress
 yet. I think there are two ways to do this, neither of which is
 straightforward.

 The "extract ward/village mentioned in roll PDF" strategy is one option.
 Depending on raw data, this can however be cumbersome (one source in the
 vernacular, one in Latin script, etc); I know a couple of scholars who
 attempt to do this and they are stuck all the time, having had to
 manually match rather frequently (which is a pain given that there are
 800,000 or so polling stations).

 Currently, we have the additional problem that many of the current roll
 PDFs - for instance in UP - are broken: one cannot copy-paste (or
 pdftotext, or extract through whatever means) from them, chiefly because
 the ToUnicode CMap is corrupted by the version of Crystal Reports the ECI
 is using. There is no real workaround other than reverse-OCR, which is a
 pain-in-the-a**. Let me know if you figure out another way...

 The second option would be a very different strategy, namely GIS
 matching through nearest-neighbour analysis: what is the closest Census
 village/ward around that particular polling booth (or the other way
 round - the computational challenge is to match ALL booths to at least
 one ward AND vice versa). Unfortunately, Census village/ward lat/long is
 not in the public domain, as far as I see - and using proprietary data
 to do the matching is legally complicated (even if one redistributes
 only the matching result and not the proprietary data).
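
A brute-force sketch of the two-way nearest-neighbour matching, assuming one had a lat/long point for every booth and every ward. The function names, the booth and ward ids and the coordinates are all hypothetical, and a real run over hundreds of thousands of booths would need a spatial index rather than this pairwise loop.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest(point, candidates):
    """Return the key in `candidates` whose location is closest to `point`."""
    return min(candidates, key=lambda name: haversine_km(point, candidates[name]))


def match_both_ways(booths, wards):
    """Pair every booth with its nearest ward AND every ward with its nearest booth."""
    pairs = {(booth, nearest(loc, wards)) for booth, loc in booths.items()}
    pairs |= {(nearest(loc, booths), ward) for ward, loc in wards.items()}
    return pairs


# Hypothetical coordinates, for illustration only.
booths = {"B1": (27.18, 78.01), "B2": (27.19, 78.02)}
wards = {"W1": (27.18, 78.01), "W2": (27.50, 78.50)}
print(sorted(match_both_ways(booths, wards)))
```

Taking the union of both directions is what guarantees that no booth and no ward is left unmatched; the price is that the result is a many-to-many table, not a clean one-to-one mapping.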

 My 5 cents,
 Let us know of any progress,

 Raphael

 On 12.03.2014 05:17, Anand Chitipothu wrote:
 
  On Wed, Mar 12, 2014 at 9:45 AM, Anand Chitipothu anandol...@gmail.com
  wrote:
 
 
 
  On Wed, Mar 12, 2014 at 8:19 AM, Siddarth Raman
  thriddas.ano...@gmail.com wrote:
 
  Hi All,
 
  In line with the discussions on elections, this is something I'd
  started working on a while back (and dropped). I was essentially
  hoping for a PC to AC to Ward mapping. As far as I understand,
  census 2011 has population data either at the level of the ward
  or the district, so if we had to run even rudimentary data
  analysis on a parliamentary or assembly constituency (like total
  population) accurately, I'm guessing we need to go bottom up.
 
  I had started this by attempting to convert
  http://eci.nic.in/eci_main/CurrentElections/CONSOLIDATED_ORDER%20_ECI%20.pdf
  into Excel (using a mixture of pattern matching in Notepad++ and a
  bit of Excel VB). It's time consuming (largely because each
  state follows its own convention - not standardized).
 
  Any suggestions on how one might go about this? If I wanted to
  estimate the population in a parliamentary constituency, or the
  total households, or the urban/rural split, how would I go about
  it? Is there a better method than looking at the above
  demarcation notification? Are there datasets on this already?
 
  New to the group, didn't find any prior discussions on
  Parliamentary to Assembly to Ward/Village demarcations.
 
 
  Hi Siddarth,
 
  The voter list PDFs have the ward info for each polling booth. The
  PDFs have the number of voters, but not the population. So it is
  possible to sum up those numbers to get a count of the number of voters
  in a PC or AC.
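
As a rough sketch of that summation, assuming the booth-level counts have already been extracted into rows that carry their AC and PC: the field names and the numbers are invented for illustration.

```python
from collections import Counter


def voters_by(rows, key):
    """Sum the 'voters' field of booth-level rows, grouped by `key` ('ac' or 'pc')."""
    totals = Counter()
    for row in rows:
        totals[row[key]] += row["voters"]
    return dict(totals)


# Hypothetical booth-level rows; real rows would come from the parsed roll PDFs.
rows = [
    {"pc": "PC-1", "ac": "AC-1", "booth": 1, "voters": 900},
    {"pc": "PC-1", "ac": "AC-1", "booth": 2, "voters": 1100},
    {"pc": "PC-1", "ac": "AC-2", "booth": 1, "voters": 1000},
]
print(voters_by(rows, "ac"))
print(voters_by(rows, "pc"))
```

The same grouping works for any booth-level quantity, but it yields voters only; population would still have to come from Census units matched to the booths.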
 
  If you want polling booth to ward mapping, I'll be able to provide it.
 
 
  btw, Anand Doshi has already parsed that PDF. The results are available
 at:
 
  https://gist.github.com/anandpdoshi/9448203
 
  Anand
  P.S: uff, so many Anands on this list

Re: [datameet] Power Tariff Data

2013-11-28 Thread Avinash Celestine
the CEA collects tariffs and duties of power across states in one place so
you might want to look at this source first...

http://cea.nic.in/eandc_wing.html

last two links on page

Avinash


On Thu, Nov 28, 2013 at 9:08 PM, Naveen Gattu naveen.ga...@gramener.com wrote:

 Thanks Venkat , this was helpful
 Regards,
 Naveen
 Sent from Handheld device
 --
 *From: * Venkata Pingali ping...@gmail.com
 *Sender: * datameet@googlegroups.com
 *Date: *Thu, 28 Nov 2013 17:04:52 +0530
 *To: *datameet@googlegroups.com
 *ReplyTo: * datameet@googlegroups.com
 *Subject: *Re: [datameet] Power Tariff Data

 Two sources:

 1. Shunglu and Chaturvedi committee reports on
 planning commission site.

 2. State and central electricity regulatory commissions
 (CERC for central level, MERC for Maharashtra etc)

 -Venkata




 On Thu, Nov 28, 2013 at 4:59 PM, Naveen Gattu
 naveen.ga...@gramener.com wrote:

 HI -

 Does anyone have electricity power tariff data (increase in power
 charges) across states for the last few years? I'd appreciate it if you
 can point to sources also.

 Thanks!

 --
 Regards,
 Naveen


 --
 For more details about this list
 http://datameet.org/discussions/
 ---
 You received this message because you are subscribed to the Google Groups
 datameet group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to datameet+unsubscr...@googlegroups.com.
 For more options, visit https://groups.google.com/groups/opt_out.




Re: [datameet] Map of Lok Sabha Constituencies?

2013-10-11 Thread Avinash Celestine
if you are not looking for georeferenced maps then the pdfs on this site
might help. you can clean the pdfs in inkscape or adobe illustrator...

http://ecimaps.gisserver1.nic.in/

these are the only maps that I know of, in the public domain, which have
post-delimitation parliamentary and assembly constituencies...

Avinash


On Fri, Oct 11, 2013 at 4:25 PM, Rushabh Mehta rme...@gmail.com wrote:

 Hello,

 I am looking for a map of lok sabha constituencies (I am looking for
 Mumbai specifically, but any other guides will also be helpful) - I have
 checked a lot of sites but the maps are badly made and many are grossly
 wrong.

 If I manage to get hi-res images, I will trace the polygons (of Mumbai
 at least) and post them online!

 thanks,
 Rushabh


 W: https://erpnext.com
 T: @rushabh_mehta



[datameet] Two easy-to-use online mapping tools

2013-09-03 Thread Avinash Celestine
Hi All

Thought I would share a couple of (free!) tools available online which
facilitate mapping. I came across these tools while doing maps for my blog
(datastories.in).

indiemapper.com
if you have a shapefile, it's a great tool. upload the shapefile and
attribute data and it generates the choropleth, or even a dot density map.
provides easy export to svg, jpg etc. also comes in handy if you just want
to convert shp to jpg. decent UI

shpescape.com/mix
tool to convert bulky shapefiles to smaller web-friendly topojson format.
this is a great help because topojson files are typically much smaller than
the corresponding geojson or shp.



will share other tools as i come across them.

regards

Avinash



Re: [datameet] Postal book of information

2013-06-11 Thread Avinash Celestine
interesting. incidentally I understand that with the networking of post
offices, there is a substantial amount of data that the postal department
has sitting on its servers (unfortunately not public as far as I know),
relating to domestic remittances through money orders. Such information
would be fascinating for many reasons, such as understanding migration
patterns.

Avinash


On Tue, Jun 11, 2013 at 12:21 PM, Vaibhav P vaibhav.li...@gmail.com wrote:

 +1. Very interesting and useful.

 Vaibhav (@ivabz)




 On Tue, Jun 11, 2013 at 12:02 PM, Thejesh GN i...@thejeshgn.com wrote:

 Very very interesting set of data

 http://www.indiapost.gov.in/Pdf/Book_of_Information_2010-2011.pdf


 Thej
 --
 Thejesh GN | ತೇಜೇಶ್ ಜಿ.ಎನ್
 http://thejeshgn.com
 GPG ID :  0xBFFC8DD3C06DD6B0









[datameet] an idea of how difficult it is to do detailed mapping in India

2013-05-15 Thread Avinash Celestine
this is an interesting paper on an attempt to do detailed sub-district
level mapping in South India.

http://www.demographie.net/sifp/Output/methodology.pdf





Re: [datameet] Provocative topic for the group's feedback! :You Can't Just Hack Your Way to Social Change

2013-04-30 Thread Avinash Celestine
this is absolutely correct. and on a related note it is also true that
there is a political context to many kinds of data analysis and such a
context is unavoidable...

for instance many of you will be aware of the Reinhart-Rogoff paper on
growth and debt levels, which was recently shown to have serious
methodological flaws... however, the inference has been drawn from this that
had the paper not made these mistakes, many of the countries in the EU
would not have been put through punishing levels of austerity. the
assumption here is that policies on austerity were heavily based on this
one paper. this is quite naive. in many such cases the political decision
has already been taken. the data part of it is just an input.

And it's not at all clear that, had the data shown something else, a
different decision would have been taken.

Avinash
On Apr 30, 2013 9:32 PM, Avinash Celestine avinash.celest...@gmail.com
wrote:

 this is absolutely correct. and on a related note
 On Apr 30, 2013 12:06 PM, Vaishnavi Jayakumar 
 jayakumar.vaishn...@gmail.com wrote:

  You Can't Just Hack Your Way to Social Change
 by Jake Porway  |   1:00 PM March 7, 2013

 blogs.hbr.org
 http://blogs.hbr.org/cs/2013/03/you_cant_just_hack_your_way_to.html

 "We have a lot of data, but we have no idea what we should do with it."
 The director of the foundation looked plaintively across the table at me.
 "We were thinking of having a hackathon, or maybe running an app
 competition," he smiled. His co-workers nodded eagerly. I shuddered.

 I have this conversation about once a week. Awash in data, an
 organization — be it a healthcare nonprofit, a government agency, or a tech
 company — desperately wants to capitalize on the insights that the Big
 Data hype has promised them. Increasingly, they are turning to hackathons
 — weekend events where coders, data geeks, and designers conspire to build
 software solutions in just 48 hours — to get new ideas and fill their
 capacity gap. There's a lot to be said for hackathons: They give the
 technology community great social opportunities and reward them with money
 and fame for their solutions, and companies get free access to a community
 of diligent experts they otherwise wouldn't know how to reach. For all of
 these upsides, however, hackathons are not ideal for solving big problems
 like reducing poverty, reforming politics, or improving education and, when
 they're used to interpret data for social impact, they can be downright
 dangerous.

 At DataKind http://datakind.org/ we run DataDives, weekend events
 that team nonprofits with pro bono data scientists to solve tough social
 problems. They are not easy to get right. Data events like these require
 special requirements beyond your average hackathon. You need to have a
 clear problem definition, include people who understand the data not just
 data analysis, and be deeply sensitive with the data you're analyzing.

 Any data scientist worth their salary will tell you that you should start
 with a question, NOT the data. Unfortunately, data hackathons often lack
 clear problem definitions. Most companies think that if you can just get
 hackers, pizza, and data together in a room, magic will happen. This is the
 same as if Habitat for Humanity gathered its volunteers around a pile of
 wood and said, "Have at it!" By the end of the day you'd be left with a
 half of a sunroom with 14 outlets in it.

 Without subject matter experts available to articulate problems in
 advance, you get results like those from the Reinvent Green Hackathon
 (http://www.nyc.gov/html/digital/html/opengov/reinventgreen.shtml).
 Reinvent Green was a city initiative in NYC aimed at having technologists
 improve sustainability in New York. Winners of this hackathon included an
 app to help cyclists bikepool together and a farmer's market inventory
 app. These apps are great on their own, but they don't solve the city's
 sustainability problems. They solve the participants' problems because as a
 young affluent hacker, my problem isn't improving the city's recycling
 programs, it's finding kale on Saturdays.

 To avoid this problem, organizations have to be willing to put time and
 effort into scoping problems with the technologists ahead of time. Reinvent
 Green could have invited recycling managers, urban planners, or other
 experts to converse with the hackers before the event. Organizations also
 need to be willing to get down-and-dirty with the data geeks during the
 weekend. It's not enough to just throw the data over the wall and hope for
 the best.

 Subject matter experts are doubly needed to assess the results of the
 work, especially when you're dealing with sensitive data about human
 behavior. As data scientists, we are well equipped to explain the "what" of
 data, but rarely should we touch the question of "why" on matters we are
 not experts in. Take for example a finding from the data team at Uber
 that prostitution arrests increased on Wednesdays http://blog.uber.com

Re: [datameet] Is data behind non-monetary paywalls really open?

2013-02-18 Thread Avinash Celestine
there was some movement towards the making of govt funded work, or
hopefully just govt work, copyright-free. there were many proposals before
the govt when the last amendment to the copyright act came up for revision.
However, such proposals were rejected and didn't go through.

It is worth taking a look at the copyright act to see how govt works/govt
funded works are treated, and you'll find that fairly interesting.

copyright in govt works: 60 years

section 52 of copyright act- certain acts *not* to be infringement of
copyright  (emphasis added):

*(q) the reproduction or publication of-*

(i) any matter which has been published in any Official Gazette *except* an
Act of a Legislature;

(ii) any Act of a Legislature *subject to the condition* that such Act is
reproduced or published together
with any commentary thereon or any other original matter;

(iii) the report of any committee, commission, council, board or other like
body appointed by the
Government if such report has been laid on the Table of the Legislature,
*unless* the reproduction or
publication of such report is prohibited by the Government;

(iv) any judgement or order of a court, tribunal or other judicial
authority, unless the reproduction or
publication of such judgment or order is prohibited by the court, the
tribunal or other judicial authority,
as the case may be;


*(r) the production or publication of a translation in any Indian language
of an Act of a Legislature and*
*of any rules or orders made thereunder-*

(i) if no translation of such Act or rules or orders in that language has
previously been produced or
published by the Government; or

(ii) where a translation of such Act or rules or orders in that language
has been produced or
published by the Government, if the translation is not available for sale
to the public:

Provided that such translation contains a statement at a prominent place to
the effect that the
translation has not been authorised or accepted as authentic by the
Government;



If you carefully go through the law excerpted above, and pay particular
attention to the exceptions there, you'll come to a set of what can only be
called 'interesting' conclusions... :-)

for instance if i were to take a copy of a govt act and put it up on a
website of mine, without any value added from my side, that's technically a
violation of copyright.

A



On Tue, Feb 19, 2013 at 9:32 AM, L. Shyamal lshya...@gmail.com wrote:

 I support Karthik's point and think there is a long overdue case to call
 for all publicly funded work (incl. government works, publications, data
 etc.) to be explicitly released into the public domain (as defined here
 http://law.yourdictionary.com/articles/what-is-public-domain.html ) or
 failing that freely licensed (given that the Copyright laws and other laws
 are too inertia bound).

 If there is to be a single point agenda for any knowledge related
 organization, it would certainly be to seek change in the clause related to
 work of government so as to be along the lines of 17 USC 105 - a useful
 discussion on it can be found at
 http://www.law.cornell.edu/uscode/text/17/105

 best wishes
 Shyamal

 http://muscicapa.blogspot.com



  Is data behind non-monetary paywalls really open?
 http://groups.google.com/group/datameet/t/e0c7882e428fcc6a

Karthik Shashidhar karthik.shashid...@gmail.com Feb 18 10:30AM +0530

What do you think of websites or organizations that ask you to fill up an
elaborate form or write an elaborate research proposal before they share
their data with you? Do you think such data is really open?

I find monetary paywalls more egalitarian than such artificial paywalls
because in the former case the data is available to anyone who pays. In
case of artificial paywalls though (I will call them paywalls since you
effectively pay by massaging the ego of the person who controls the data)
there is no guarantee that you will get the data and people controlling it
can reject your requests for arbitrary reasons.

Don't you think there is a case to campaign for data behind such artificial
paywalls to be put in the public domain and made really free? At least we
should campaign for all data produced as part of publicly funded research
to be made publicly available in an easily downloadable and usable format.

Regards
Karthik









-- 
For more details about this list
http://datameet.org/discussions/
--- 
You received this message because you are subscribed to the Google Groups 
datameet group.
To unsubscribe from this