Devdatta,
I lined up a map for Orissa (spent 5 minutes, might be able to do it a 
little better if I spend more time).  I used a district shapefile from their 
government (or potentially the SOI); I believe it has a custom projection. 
They had it up a few weeks ago, then the website disappeared...  I 
rectified, then used an unsupervised classification, then vectorized.  I 
haven't gone in and cleaned up the data, but do you think this would be 
worthwhile developing?  I would rather have software do all the work than 
actually trace lines myself. Let me know your thoughts.
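
For illustration, the unsupervised classification step can be sketched 
roughly as below. This is a minimal Python sketch, not the actual tool or 
settings used for this map: it assumes a tiny grayscale patch stored as 
nested lists, assumes dark pixels are boundary line work, and uses a plain 
two-class k-means. The names kmeans_two_class, classify_raster and patch 
are hypothetical.

```python
# Minimal sketch: two-class k-means on grayscale pixel values.
# Assumption: dark pixels are boundary ink, light pixels are background.

def kmeans_two_class(values, iterations=20):
    """Cluster 1-D values into two classes; return (low_center, high_center)."""
    # Deterministic start: the darkest and the lightest value seen.
    low, high = float(min(values)), float(max(values))
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = sum(low_group) / len(low_group)
        # If every value landed in the low group, keep the old high center.
        new_high = sum(high_group) / len(high_group) if high_group else high
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving
        low, high = new_low, new_high
    return low, high


def classify_raster(raster):
    """Return a mask with 1 where a pixel falls in the darker class."""
    flat = [v for row in raster for v in row]
    low, high = kmeans_two_class(flat)
    return [[1 if abs(v - low) <= abs(v - high) else 0 for v in row]
            for row in raster]


# Hypothetical 4x5 patch of a scanned map: values near 0 are dark line work.
patch = [
    [250, 248, 20, 251, 249],
    [247, 30, 25, 252, 250],
    [15, 22, 246, 249, 251],
    [249, 250, 248, 18, 247],
]
mask = classify_raster(patch)

# (row, column) positions of cells classified as boundary.
boundary_cells = [(r, c) for r, row in enumerate(mask)
                  for c, v in enumerate(row) if v]
print(boundary_cells)
```

Turning the resulting mask into lines or polygons is a separate step that 
would normally be left to a GIS tool (GDAL's polygonize utility is one 
option), followed by the manual cleanup mentioned above.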

Data is here: https://app.box.com/s/lfeg76yxkcqpyixojorg

Justin

On Friday, July 18, 2014 12:16:31 AM UTC-4, Justin Meyers wrote:
>
> Devdatta,
> Yikes!  I was really hoping there was some dataset(s) out there that 
> actually made sense...  Even the census tables from 
> http://censusindia.gov.in/ have duplicates, and it isn't always clear 
> which record should be used.   Some of the village level data I have seen 
> shows the wrong tehsil code in a central town (let's say the town code is 
> 33333333xxxxxxxx, while all the surrounding villages have codes that are 
> 33444444xxxxxxxx).  I have worked with some wild data in the past, but 
> India seems like a nightmare. What it will most likely come down to is that 
> it will make sense that it doesn't make sense... if that makes sense!?!
>
> I think I need to collect my thoughts with all this and re-calibrate. I 
> have a couple of ideas, but I tried them and the results didn't make sense 
> (so maybe they are correct (makes sense that it doesn't make sense...!??)).
>
> I'll keep you posted.  If you come up with anything, or additional 
> resources, please let me know.
>
> Cheers!
> Justin
>
> On Thursday, July 17, 2014 11:45:29 PM UTC-4, Devdatta Tengshe wrote:
>>
>> Hi Justin,
>>
>> It's very hard to look at Survey of India Digital Data and preserve your 
>> sanity. As you have found out, the boundaries of different Administrative 
>> levels do not match. There are many reasons for this, and not all of them 
>> are solvable.
>>
>> The boundaries in the PDFs are generalised no doubt, but if one takes 
>> care while digitizing at the correct scale, one shouldn't have many 
>> problems. See the district shapefiles on the github repo. They were made 
>> with a top-down procedure. I used the PC boundaries for the country and 
>> state boundaries. The individual district boundaries were made by referring 
>> to these very Census maps, as well as tehsil boundaries. I also used a 
>> custom tool which I have developed, which helps in cutting one polygon 
>> based on another polygon, and which tremendously cut down the time I spent 
>> on creating these internal boundaries. So while the district boundaries 
>> might be generalised in some cases, that's the best, updated shapefile I 
>> know of today.
>>
>> Having worked with government departments, I have learnt that getting 
>> data itself is a big task. Any data is a boon. And once I get the data, I 
>> don't expect it to match anything else. With this paradigm, the Census maps 
>> are a goldmine for me.
>>
>> Regards,
>> Devdatta
>>
>>
>> On Fri, Jul 18, 2014 at 8:58 AM, Justin Meyers <justinell...@gmail.com> 
>> wrote:
>>
>>> Devdatta,
>>> Thanks for the quick response. I thought the files originated from the 
>>> Survey of India, but wasn't certain.  I started to create a villages 
>>> dataset, but the tehsils do not really align with what the 2001 census 
>>> villages state their respective parent tehsil is...  So I am assuming all 
>>> of the data from the government is a mixed bag (spelling may be off, codes 
>>> may be wrong/outdated, data may be mixed between years).  What a mess!?!?! 
>>> As for rectifying and creating maps based off the PDFs, I'm not sure I 
>>> would do that.  The lines they have for boundaries are very, very 
>>> generalized.  Also, I tried (a few years ago) to line them up with actual 
>>> vector data, and there is a huge shift (I was using WGS84 vector data, so 
>>> maybe I should have reprojected).
>>>
>>> Maybe it would be best to start top down or bottom up.  So either build 
>>> a dataset from villages up to states or states down to villages.
>>>
>>> Thoughts?  We need some official data though (which seems impossible to 
>>> find...).  But anything is possible, right!?!
>>>
>>> Cheers,
>>> Justin
>>>
>>> On Thursday, July 17, 2014 11:17:33 PM UTC-4, Devdatta Tengshe wrote:
>>>
>>>> Hi Justin,
>>>> I know the euphoria that one has when one has done something new. It's 
>>>> one of the best things in the world.
>>>>
>>>> If the original source you mentioned is Bhuvan, then the files came 
>>>> directly from Survey of India. I have used those files before, and as you 
>>>> mentioned there were only some 2000-odd features in it.
>>>>
>>>> They are not from any specific era. Some tehsils in the file were 
>>>> created post 2001, while others created in the 90's were not present.
>>>>
>>>> The only exhaustive source I know of is the Census Administrative Atlas. 
>>>> They have maps in PDF format, not in shapefiles, and I had used it to 
>>>> create the district shapefiles which are shared on the datameet github 
>>>> repos.
>>>> Sometimes I feel I should get started on digitizing those PDFs. It 
>>>> shouldn't take more than 40 hours. 
>>>>
>>>> Regards,
>>>> Devdatta Tengshe
>>>>
>>>>
>>>> On Fri, Jul 18, 2014 at 8:27 AM, Justin Meyers <justinell...@gmail.com> 
>>>> wrote:
>>>>
>>>>> Devdatta,
>>>>> Sorry I didn't type that up.  I just finished processing it and was 
>>>>> excited and posted.  The previous file I posted had 2,693 features.  This 
>>>>> file has 2,739 features.  Initially I thought the data was relevant to 
>>>>> 2001, but maybe it is 1991 (I have no metadata, and the Indian government 
>>>>> does not respond to my e-mails; I have sent at least a dozen).  I am not 
>>>>> certain of the exact source; it is hosted by Bhuvan (who do not respond 
>>>>> to emails either....).
>>>>>
>>>>> As for processing, I took the data and sorted the attributes (they 
>>>>> were all attached as one long string, so I split it and created the 
>>>>> fields).
>>>>>
>>>>> Any other questions?  If you know of a more current dataset please 
>>>>> post!!
>>>>>
>>>>>
>>>>> On Thursday, July 17, 2014 9:19:41 PM UTC-4, Devdatta Tengshe wrote:
>>>>>
>>>>>> Hi Justin,
>>>>>>
>>>>>> Can you let us know what the procedure to create this file was, and 
>>>>>> up to which date it is accurate?
>>>>>> I'm asking because this shapefile has 2739 sub districts, and according 
>>>>>> to the census, there should be 5564.
>>>>>>
>>>>>> Regards,
>>>>>> Devdatta Tengshe
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 17, 2014 at 11:15 PM, Justin Meyers <
>>>>>> justinell...@gmail.com> wrote:
>>>>>>
>>>>>>> https://app.box.com/s/486rvabh3sjviiynbyu4
>>>>>>>
>>>>>>>
>>>>>>> Cheers!
>>>>>>>
>>>>>>> -- 
>>>>>>> Datameet is a community of Data Science enthusiasts in India. Know 
>>>>>>> more about us by visiting http://datameet.org
>>>>>>> --- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "datameet" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to datameet+u...@googlegroups.com.
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
