Dear Mark,
Happy to know about your interest. I am also quite interested in these 
issues, having worked on the 1991, 2001 and 2011 census datasets and their 
spatial representation (at least for Karnataka and some other states). 
There are many issues, both with the census datasets themselves and with 
the spatial boundary datasets released by Meiyappan et al. I may not be 
able to lay out everything immediately, as I am in the throes of 
some deadlines, but I hope to go through your writeup and respond a bit 
later, maybe mid-June, if that is okay with you.


On Sunday, May 31, 2020 at 1:09:24 AM UTC+5:30, Mark Montgomery wrote:
> Let me introduce myself to the group in this way: I am an Economics 
> professor at Stony Brook University in New York, with a long-time interest 
> in Indian urbanization. I am also keen to see as much as possible of the 
> spatial and socioeconomic detail on urbanization placed in the public 
> domain. Toward that end, colleagues and I have been knitting together the 
> 2001 and 2011 primary census abstracts (PCAs) that the Indian census 
> authorities have made available on the census website and incorporating 
> published data from the District Census Handbooks, all of these at the 
> level of individual settlements with coverage of wards for the PCAs. Our 
> aim is to create an integrated and publicly-accessible database based only 
> on publicly-available sources. As you would know very well, the spatial 
> side of the task is more challenging for 2001 than 2011.
> At the moment, I seek your guidance on the remarkable DataMeet collection 
> of polygons for villages, census towns, and statutory urban centers, to 
> which a number of you have contributed months or even years of effort. I 
> have linked your spatial records to the PCA identifiers (including 
> subdistrict and district) and in the process have come across some issues 
> (mainly concerning the vintages of the maps that were used, and various 
> oddities regarding identifiers) that some of you may know about. My own 
> spatial work uses R, but I am happy to share these results with the group 
> in other spatial formats (for instance, as GeoJSON or GeoPackage files). 
> The next steps I have in mind are to compare the DataMeet polygons with the 
> often-mentioned Meiyappan et al. (2018) polygons that have been publicly 
> available at the Socioeconomic Data and Applications Center (SEDAC) site since 
> 2018, and with a lesser-known but evidently high-quality collection of 2001 
> point coordinates for villages and some hamlets assembled by a University 
> of Tokyo history professor and available on his website.
> I'm attaching a short PDF that explains these three public-domain sources 
> (with links to the SEDAC and Univ. of Tokyo sources, and with a critical 
> review of aspects of those spatial datasets), and which in particular lays 
> out some of the issues I've encountered with the DataMeet collection. (I've 
> yet to get to grips with the Karnataka data for 1991, and with the 
> Rajasthan data that I believe are for 2011 or later.) I would be really 
> grateful for criticism and suggestions!

Datameet is a community of Data Science enthusiasts in India.