[datameet] Survey of India v/s GMaps

2014-12-16 Thread Thejesh GN
The saga continues. This is the latest from SoI on Facebook[1], which I have
archived[2]; it concerns the .com and .cn domains. I couldn't find the same
notice in the notices section of the SoI site[3].


Reproducing the post:

[QUOTE]
Survey of India, Office of the Surveyor General of India,
December 12 at 8:34pm ·
It is observed that M/s Google Map is wrongly depicting external
boundaries of India on maps hosted by them on different domains,
namely http://www.google.cn/maps/@25.7377892,88.726647,4z
and https://www.google.com/maps/@25.1367452,80.1073483,4z , which are
not equivalent to Government of India authentication.
The correct official boundaries of India has been shown
at http://www.surveyofindia.gov.in/fil…/Correct_India_Map_1.pdf
In this regard, on the instructions of Department of Science &
Technology, Survey of India has officially filled a FIR with SHO, PS
Dalanwala, Dehradun on 12-12-2014 at 18:00 hrs requesting them to take
appropriate legal action against M/s Google Maps for displaying
incorrect map of India.
[/QUOTE]


[1] https://www.facebook.com/permalink.php?story_fbid=894090357277177&id=277739112245641&fref=nf
[2] https://archive.today/BCKxo
[3] http://www.surveyofindia.gov.in/notices/archive



Thej
--
Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
--- 
You received this message because you are subscribed to the Google Groups 
datameet group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [datameet] Survey of India v/s GMaps

2014-12-16 Thread Shree D N
Is this the official page of SoI? I saw some photos from Tokyo etc. posted by
the page...


-- 
---
Cheers,

Shree | Associate Editor
Oorvani Foundation | Citizen Matters http://bangalore.citizenmatters.in -
Bangalore's own online news magazine
Bangalore | Tel: +91-80-4173 7584 | Mobile: +91-95909 35559
Follow us on Twitter https://twitter.com/citizenmatters | Follow us on
Facebook https://www.facebook.com/citizenmatters



Re: [datameet] Survey of India v/s GMaps

2014-12-16 Thread Thejesh GN
On 16 December 2014 at 15:04, Shree D N wrote:
 Is this the official page of SoI? I saw some photos from Tokyo etc. posted by
 the page...

It's not a verified page, so I am not 100% sure.

That said, IBNLive has reported on this:
http://ibnlive.in.com/news/incorrect-map-of-india-on-google-websites-government/517821-3.html





Thej
--
Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0



Re: [datameet] Re: India 2001 census data village-wise

2014-12-16 Thread Anand Chitipothu
On Tue, Dec 16, 2014 at 11:56 AM, Rick Morgan rick.morg...@gmail.com
wrote:

 Hello,

 Are you still looking for this data? With some rather complex Python code, I
 scraped
 http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx
 and collected about 90% of all the records over the course of 6 months. The
 other 10% are corrupt or otherwise missing. I had to move to other tasks,
 so I walked away, but I plan to collect the remaining records within the
 next few months.

 I have 473,514 complete records. Currently, the majority are in *.html
 format. The file is rather large, so I will have to send it over in
 batches.


You could upload them to archive.org and share a link here.

Anand



Re: [datameet] Re: India 2001 census data village-wise

2014-12-16 Thread Rick Morgan
Yeah, I found that late last night. I am not sure what projection it is in,
as the borders are shifted relative to my known-good district map. I am going
to play with this a bit more. He mentioned that he got it from the DevInfo
program. When you export the *.kmz and overlay it in Google Earth, the
problem remains. I might end up going through a handful of CRS/projections,
those that I know are often used for India, to see if this helps.
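
Before cycling through candidate CRS definitions, it may help to measure the shift itself. The sketch below is only an illustration: the control points are made up, and the reading of the result (a near-constant offset being more consistent with a datum or origin difference, a strongly varying one with a different projection) is a rule of thumb, not a diagnosis.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, adequate for a rough check


def offsets_in_metres(reference, shifted):
    """Return (east, north) offsets in metres for matching (lon, lat) pairs."""
    result = []
    for (lon0, lat0), (lon1, lat1) in zip(reference, shifted):
        north = math.radians(lat1 - lat0) * EARTH_RADIUS_M
        east = (math.radians(lon1 - lon0) * EARTH_RADIUS_M
                * math.cos(math.radians(lat0)))
        result.append((east, north))
    return result


def summarize(offsets):
    """Mean offset and the largest deviation from that mean, in metres."""
    n = len(offsets)
    mean_e = sum(e for e, _ in offsets) / n
    mean_n = sum(nn for _, nn in offsets) / n
    spread = max(math.hypot(e - mean_e, nn - mean_n) for e, nn in offsets)
    return mean_e, mean_n, spread


# Hypothetical control points: the same landmarks read off both layers.
known_good = [(77.0, 13.0), (80.0, 20.0), (73.0, 26.0)]
exported = [(77.002, 13.001), (80.002, 20.001), (73.002, 26.001)]

mean_e, mean_n, spread = summarize(offsets_in_metres(known_good, exported))
print("mean shift: %.0f m east, %.0f m north; spread %.0f m"
      % (mean_e, mean_n, spread))
```

With real data, the control points would be a few dozen features that can be identified unambiguously in both the known-good map and the exported layer.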

That discussion was a bit heated. I am going to assume that this is not the
norm.

Cheers,
Rick Morgan


On Tue, Dec 16, 2014 at 9:52 PM, Sharad Lele sharad.l...@gmail.com wrote:

 I remembered that Justin had made available an all India taluka/CD block
 shapefile.

 Sharad



Re: [datameet] Re: India 2001 census data village-wise

2014-12-16 Thread Rick Morgan
There are over 400,000 records, so the upload will take a very long time
and will require more RAM than I can currently spare. (I have a rather
extensive Bayesian measurement model running right now that takes priority.)
Do you know Python? I could give you my scraping code.
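
One way to make a large collection of small files easier to send in batches is to pack it into a series of compressed archives. This is a minimal sketch, not the approach used in the thread: the function name, the batch size, and the archive naming are all assumptions, and it presumes the scraped pages sit in a single directory.

```python
import os
import tarfile


def pack_in_batches(src_dir, out_dir, batch_size=10000):
    """Pack the files in src_dir into numbered .tar.gz archives.

    Files are added one at a time, so memory use stays small no matter
    how many records there are.
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    names = sorted(os.listdir(src_dir))
    archives = []
    for start in range(0, len(names), batch_size):
        archive = os.path.join(
            out_dir, "batch_%04d.tar.gz" % (start // batch_size))
        with tarfile.open(archive, "w:gz") as tar:
            for name in names[start:start + batch_size]:
                # arcname keeps only the file name inside the archive
                tar.add(os.path.join(src_dir, name), arcname=name)
        archives.append(archive)
    return archives
```

Calling `pack_in_batches("html", "batches")` would then produce `batches/batch_0000.tar.gz`, `batches/batch_0001.tar.gz`, and so on, each of which can be uploaded separately.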




[datameet] Re: India 2001 census data village-wise

2014-12-16 Thread Rick Morgan

Here is the code. If you want the *.html and JSON data, you can scrape it
with this. That said, I will work on converting it all to *.csv as soon as
my R console is free. I hope this helps. Cheers.
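
For anyone who would rather do the CSV conversion in Python than in R, a rough sketch follows (Python 3). It assumes only what the script's docstring states, namely that metadata.json holds a list of dict objects; the function name is hypothetical, and the column names are whatever keys the records happen to contain.

```python
import csv
import json


def metadata_to_csv(json_path, csv_path):
    """Write the list of dicts in json_path to csv_path, one row per record."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)
    # Use the union of all keys so records with missing fields still fit.
    fieldnames = sorted(set().union(*(r.keys() for r in records)))
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

A call such as `metadata_to_csv("metadata.json", "metadata.csv")` returns the number of rows written. The per-village HTML pages would still need their own parsing step.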

On Saturday, March 1, 2014 12:23:17 AM UTC-5, Fenella C wrote:

 Hello everyone, 

 I am wondering if any of you have the village-wise 2001 Indian census data 
 in a spreadsheet (or similar) format? I am basically looking for 
 information at the village level from the 2001 census (e.g., population of 
 the village, number of households in the village, etc.)

 The data is available online at the census website here 
 http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx
  
 but it is not available in a spreadsheet. I have already tried web scraping 
 the data, but it is painfully slow, so I'm wondering if I can find it 
 elsewhere.

 Many thanks,
 Fenella


#!python2.7
"""
Download all data from the 2001 India Census: http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx.

- Saves the html for each village to a file in directory html/
- Saves a list of dict objects with metadata about each village to metadata.json
"""

import json
import os
import re
import shutil
import sys
import time
from os import path
import gzip
import codecs

import requests
from bs4 import BeautifulSoup

def write_gzipped(filename, content):
    """Write content (bytes) to filename, gzip-compressed."""
    with gzip.open(filename, 'wb') as f:
        f.write(content)

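class InvalidPostback(Exception):
    """Keeps the response and the submitted form data of a failed postback."""
    def __init__(self, response, data):
        self.response = response
        self.data = data

    def __str__(self):
        # Include cookies, form data and the returned page to ease debugging.
        message = "Cookies: %s\nForm: %s\nText: %s" % (
            dict(self.response.cookies),
            self.data,
            BeautifulSoup(self.response.text, "html.parser").prettify())
        return message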

STATES = {
    '01': "Jammu & Kashmir",
    '02': "Himachal Pradesh",
    '03': "Punjab",
    '04': "Chandigarh",
    '05': "Uttarakhand",
    '06': "Haryana",
    '07': "NCT of Delhi",
    '08': "Rajasthan",
    '09': "Uttar Pradesh",
    '10': "Bihar",
    '11': "Sikkim",
    '12': "Arunachal Pradesh",
    '13': "Nagaland",
    '14': "Manipur",
    '15': "Mizoram",
    '16': "Tripura",
    '17': "Meghalaya",
    '18': "Assam",
    '19': "West Bengal",
    '20': "Jharkhand",
    '21': "Orissa",
    '22': "Chhattisgarh",
    '23': "Madhya Pradesh",
    '24': "Gujarat",
    '25': "Daman & Diu",
    '26': "Dadra & Nagar Haveli",
    '27': "Maharastra",
    '28': "Andhra Pradesh",
    '29': "Karnataka",
    '30': "Goa",
    '31': "Lakshadweep",
    '32': "Kerala",
    '33': "Tamil Nadu",
    '34': "Puducherry",
    '35': "Andaman and Nicobar Islands",
}

def get_aspx_stuff(soup):
    """Pull out the current values of the form from a BeautifulSoup object of the webpage"""
    # Hidden ASP.NET state fields that must be echoed back with every postback.
    viewstate = soup.select("#__VIEWSTATE")[0]['value']
    eventvalidation = soup.select("#__EVENTVALIDATION")[0]['value']
    # lastfocus = soup.select("#__LASTFOCUS")[0]['value']
    # eventtarget = soup.select("#__EVENTTARGET")[0]['value']
    # eventargument = soup.select("#__EVENTARGUMENT")[0]['value']
    # Currently selected option of each dropdown.
    drpState = soup.select('#ctl00_Body_Content_drpState')[0].find('option', selected=True)['value']
    drpDistrict = soup.select('#ctl00_Body_Content_drpDistrict')[0].find('option', selected=True)['value']
    drpSubDistrict = soup.select('#ctl00_Body_Content_drpSubDistrict')[0].find('option', selected=True)['value']
    drpVillage = soup.select('#ctl00_Body_Content_drpVillage')[0].find('option', selected=True)['value']
    ret = {'__VIEWSTATE': viewstate,
           '__EVENTVALIDATION': eventvalidation,
           'ctl00$Body_Content$drpState': drpState,
           'ctl00$Body_Content$drpDistrict': drpDistrict,
           'ctl00$Body_Content$drpSubDistrict': drpSubDistrict,
           'ctl00$Body_Content$drpVillage': drpVillage}
    # '__EVENTTARGET' : eventtarget,
    # '__EVENTARGUMENT' : eventargument,
    # '__LASTFOCUS' : lastfocus}
    return ret

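def get_states(soup):
    """Extract list of states from the webpage"""
    ret = {}
    for x in soup.select('#ctl00_Body_Content_drpState')[0].findAll('option'):
        value = x['value']
        if value != 'null':  # skip the placeholder option
            ret[value] = x.text
    return ret

def get_districts(soup):
    """Extract list of districts from the webpage"""
    ret = {}
    for x in soup.select('#ctl00_Body_Content_drpDistrict')[0].findAll('option'):
        value = x['value']
        if value != 'null':
            ret[value] = x.text
    return ret

def get_sub_districts(soup):
    """Extract list of sub districts from the webpage"""
    ret = {}
    # The archived message is cut off at this point. The rest of this loop is
    # reconstructed by analogy with get_districts() and is an assumption.
    for x in soup.select('#ctl00_Body_Content_drpSubDistrict')[0].findAll('option'):
        value = x['value']
        if value != 'null':
            ret[value] = x.text
    return ret

# (The remainder of the script is not included in the archived message.)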
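
The form fields collected by get_aspx_stuff() follow the usual ASP.NET WebForms postback pattern: the hidden state fields from the last response are sent back together with the control that "changed". As a rough, standard-library-only illustration (Python 3), the sketch below builds such a payload from a page's HTML. The class and function names are hypothetical, and whether the census site accepts exactly this payload has not been checked.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name -> value for every <input type="hidden"> on a page."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def build_postback(page_html, event_target, selections):
    """Form data for the POST that changing one dropdown would trigger."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    data = dict(parser.fields)            # __VIEWSTATE, __EVENTVALIDATION, ...
    data["__EVENTTARGET"] = event_target  # the control that "changed"
    data["__EVENTARGUMENT"] = ""
    data.update(selections)               # current dropdown values
    return data
```

The resulting dict could be passed as the `data` argument of `requests.post()`, with the same session reused across requests so that cookies carry over.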

Re: [datameet] Re: India 2001 census data village-wise

2014-12-16 Thread Sharad Lele [शरच्चंद्र लेले]
No problem at all. There is no hurry on this whatsoever. I should
mention that I have already purchased CDs of the 2001 data for several
states where I do my research.


Sharad


