Hi Justin and others,

All the very best for this effort.
I want to share our (Pune chapter's) experience with the Maharashtra villages data. This might also help resolve some of the questions filed as issues on our GitHub repo.

To start with, we found quite a lot of mismatch between the shapefile we got and the Census 2011 village data. We are treating the census data as the benchmark because it is a unique, well-ordered, government-ratified dataset that is published openly, whereas the shapefile was sourced informally. We went with 2011 because, all things considered, 2011 is closer to present reality than 2001.

After a few attempts by me at "fixing" things, we decided it was best not to mess with the shapes: we were crossing many points of no return in the process, and the output was looking more like Swiss cheese than a map. Instead, we could just add new columns/attributes to indicate the recommended changes. The agenda shifted from fixing things and making a proper map to documenting what is wrong and where corrections are needed, and hopefully someday sending that feedback upstream and getting the appropriate government agency (MRSAC in this case) to fix it. Or, if we do publish a "fixed" version at some point, we need full documentation of exactly what changes were made, so that there is traceability.

And so we set up *this tracking sheet <https://docs.google.com/spreadsheets/d/1vryZTdPOXEblEac_zZw_erSOTYqy5qY-rG_V1bRCA8Q/edit?usp=sharing>*. It's a bit chaotic, with several worksheets. I won't be able to explain further here, and it is a suspended work in progress: we have set some things up but left the tasks pending for potential volunteers and interns to take up. For those who want to know more about this, please reply with a different subject line or post your queries on the #pune channel in datameet.slack.com.
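To make the "annotate, don't edit" idea above concrete, here is a minimal sketch in plain Python. It is not the process we actually ran: the field names (`census_code`, `check_status`), the status labels and the sample rows are all hypothetical, and the attribute table is assumed to be already loaded as a list of dicts. The point is only that the original values stay untouched and the findings go into a new column.

```python
from collections import Counter

def audit_codes(shape_rows, census_codes, code_field="census_code"):
    """Flag problem rows in a new 'check_status' column; never alter existing values.

    shape_rows   : list of dicts, one per shapefile feature (attributes only)
    census_codes : iterable of village codes from the census table
    Both the field name and the status labels are illustrative assumptions.
    """
    census_codes = set(census_codes)
    # Count only non-empty codes so that nulls are reported separately.
    counts = Counter(r.get(code_field) for r in shape_rows if r.get(code_field))

    annotated = []
    for row in shape_rows:
        code = row.get(code_field)
        if not code:
            status = "missing_code"
        elif counts[code] > 1:
            status = "duplicate_code"
        elif code not in census_codes:
            status = "not_in_census"
        else:
            status = "ok"
        # Copy the row and add the new column; the input row is left as-is.
        annotated.append({**row, "check_status": status})

    # Census villages that no shape claims at all.
    unmatched_census = sorted(census_codes - set(counts))
    return annotated, unmatched_census


# Hypothetical sample data, for illustration only.
rows = [
    {"name": "A", "census_code": "1001"},
    {"name": "B", "census_code": "1001"},
    {"name": "C", "census_code": None},
    {"name": "D", "census_code": "9999"},
    {"name": "E", "census_code": "1002"},
]
annotated, unmatched = audit_codes(rows, ["1001", "1002", "1003"])
for r in annotated:
    print(r["name"], r["check_status"])
print("census codes with no shape:", unmatched)
```

Because the output is just the input plus one column, the same list can be exported to a tracking sheet or written back as a new attribute, and the original shapefile remains available for comparison.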
We did have one immediate requirement: producing a shapefile for *a web interface that Namita was developing* <https://bnamita.github.io/Village_Mapping_v2/> that dynamically combines shapefile and census CSV data. We needed the shapefile to have a census code column/attribute that is non-null and unique, to act as the primary key for matching with the census data. So I created a version with a new column added, in which the repeating and null codes were suffixed with serial numbers. (That renders them unmatchable with the census data, but at least they do not interfere with the program.) I have *documented the process here <https://docs.google.com/document/d/e/2PACX-1vQO_bAKdtsoC61POlkmRUp32p1NfdxXtNqZ4Rk2gcEJdphPBtyiwKxVSzuFVnZSlN2ShEBcnQffdSL8/pub>*.

So my suggestion for the all-states effort would be to document all the changes and fixes made, and to keep ways of tracing back. (Adding new columns and making the changes there is one such way.) It's much tougher, but it will help set up a situation where the fixes can be integrated back into official sources rather than remaining isolated forks. We can be skeptical of that ever happening, but the alternative is having to repeat the entire exercise after the 2021 census, or any time the government agencies republish the shapes. In any case, keeping documentation of the changes will speed up the next round, as the mistakes we find now will likely be repeated later.

PS: By "we" I am referring to myself, Craig, Devdatta, Namita, Riddhi, Jinda and some more people from Pune who contributed to the process (apologies for missing names!). We did a few meetups and then formed a smaller focus group.
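For anyone who wants to reproduce the unique-key step mentioned above, this is a rough sketch of one way to do it in plain Python. It is an illustration, not the exact procedure from the linked document: the function name, the `_1`, `_2` suffix format and the `NULL` placeholder are my assumptions here, and the result is meant to go into a new column while the original code column stays as it is.

```python
from collections import Counter

def make_unique_keys(codes, null_prefix="NULL"):
    """Return one non-null, unique key per input code.

    Codes that occur once are kept unchanged. Repeated codes get a serial
    suffix (code_1, code_2, ...) and empty codes become NULL_1, NULL_2, ...
    The suffix format and the NULL placeholder are illustrative assumptions.
    """
    counts = Counter(c for c in codes if c)
    seen = Counter()
    null_serial = 0
    keys = []
    for code in codes:
        if not code:
            null_serial += 1
            keys.append(f"{null_prefix}_{null_serial}")
        elif counts[code] > 1:
            seen[code] += 1
            keys.append(f"{code}_{seen[code]}")
        else:
            keys.append(code)
    # Guard against the unlikely case where a generated key collides
    # with a code that already exists in the data.
    if len(set(keys)) != len(keys):
        raise ValueError("generated keys are not unique; choose another suffix format")
    return keys


# Hypothetical codes, for illustration only.
original = ["1001", "1001", None, "1002", ""]
print(make_unique_keys(original))
# ['1001_1', '1001_2', 'NULL_1', '1002', 'NULL_2']
```

Every copy of a repeated code is suffixed, not just the second one onward, so none of them silently matches a census row that may belong to a different village. Those rows then show up as unmatched and can be reviewed by hand.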
--
Cheers,
Nikhil VJ
+91-966-583-1250
Pune, India
Website <http://nikhilvj.co.in>
DataMeet Pune chapter <https://datameet-pune.github.io/>
Self-designed learner at Swaraj University <http://www.swarajuniversity.org>
Contribute <https://www.instamojo.com/@nikhilvj/>

On Fri, Mar 23, 2018 at 9:13 AM, Justin <justinelliotmey...@gmail.com> wrote:

> okay; started to upload the data:
> https://github.com/justinelliotmeyers/INDIA_2018_SHAPEFILE_BOUNDARIES
>
> please give any feedback, point out errors, what I did wrong, best way to
> move forward, etc. This is v1, so I expect issues to exist, and missing
> locations, bad spellings, code issues, etc.
>
> I tried to clean it up as much as possible - in order to get this pushed
> out before the weekend I dissolved to a grid to keep geometry size down. We
> can review geometry after initial review of attributes.
>
> I just threw everything in a repo; we can upload to a more formal project
> after this first stage, just wanted to start.
>
> Thanks!!!!!
>
> --
> Datameet is a community of Data Science enthusiasts in India. Know more
> about us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google Groups
> "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to datameet+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.