Hi Justin, 

Sorry for butting in.

The situation you are describing could arise because some villages are
enclaves.
There are enclaves even at the state level, where some villages belonging
(administratively) to one state lie completely inside (geographically), and
are surrounded by, another state.
I was perplexed by this when I studied the Parliamentary Constituency data
in detail.
I had posted on this topic, and Mr. Devdatta and Mr. Lele confirmed
the same. I think you will have to look at these on a case-by-case basis.
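
A minimal Python sketch of one way to shortlist such cases for manual review. It assumes (hypothetically) that the first 8 characters of a village code identify its administrative parent, and that the unit each village geographically falls inside has already been worked out elsewhere, for example with a point-in-polygon test. The function name, the prefix length, and the sample records are all illustrative, not taken from any census file.

```python
# Hypothetical layout: the first 8 characters of a village code identify its
# administrative parent. Real census code layouts may differ.
PARENT_PREFIX_LEN = 8


def flag_possible_enclaves(villages, prefix_len=PARENT_PREFIX_LEN):
    """Return villages whose administrative parent differs from the unit
    they geographically fall inside.

    `villages` is an iterable of (name, code, containing_unit) tuples, where
    `containing_unit` is the code prefix of the unit whose polygon contains
    the village (computed elsewhere).
    """
    flagged = []
    for name, code, containing_unit in villages:
        admin_parent = code[:prefix_len]
        if admin_parent != containing_unit:
            flagged.append((name, admin_parent, containing_unit))
    return flagged


# Made-up records mirroring the pattern described in the thread: one town
# whose code prefix differs from everything around it.
sample = [
    ("Village A", "3344444400000001", "33444444"),
    ("Village B", "3344444400000002", "33444444"),
    ("Town C", "3333333300000001", "33444444"),
]

for name, admin, geo in flag_possible_enclaves(sample):
    print(f"{name}: belongs to {admin}, lies inside {geo}")
```

A mismatch flagged this way is only a candidate: it could be a genuine enclave, or simply a wrong or outdated code, so each one still needs checking by hand.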

On Friday, July 18, 2014 9:46:31 AM UTC+5:30, Justin Meyers wrote:
>
> Devdatta,
> Yikes!  I was really hoping there was some dataset(s) out there that 
> actually made sense...  Even the census tables from the 
> http://censusindia.gov.in/ have duplicates, and it isn't always clear 
> what record should be used.   Some of the village level data I have seen 
> shows the wrong tehsil code in a central town (let's say the town code is 
> 33333333xxxxxxxx, while all the surrounding villages have codes that are 
> 33444444xxxxxxxx).  I have worked with some wild data in the past, but 
> India seems like a nightmare. What it will most likely come down to is that 
> it will make sense that it doesn't make sense... if that makes sense!?!
>
> I think I need to collect my thoughts with all this and re-calibrate. I 
> have a couple ideas, but I tried them and the results didn't make sense (so 
> maybe they are correct (makes sense that it doesn't make sense...!??)).
>
> I'll keep you posted.  If you come up with anything, or additional 
> resources, please let me know.
>
> Cheers!
> Justin
>
> On Thursday, July 17, 2014 11:45:29 PM UTC-4, Devdatta Tengshe wrote:
>>
>> Hi Justin,
>>
>> It's very hard to look at Survey of India Digital Data and preserve your 
>> sanity. As you have found out, the boundaries of different Administrative 
>> levels do not match. There are many reasons for this, and not all of them 
>> are solvable.
>>
>> The boundaries in the PDFs are generalised, no doubt, but if one takes 
>> care while digitizing at the correct scale, one shouldn't have many 
>> problems. See the district shapefiles on the github repo. They were made 
>> with a top-down procedure. I used the PC boundaries for the country and 
>> state boundaries. The individual district boundaries were made by referring 
>> to these very Census maps, as well as tehsil boundaries. I also used a 
>> custom tool which I have developed, which helps in cutting one polygon 
>> based on another polygon, and which tremendously cut down the time I spent 
>> on creating these internal boundaries. So while the district boundaries 
>> might be generalised in some cases, that's the best, most up-to-date 
>> shapefile I know of today.
>>
>> Having worked with government departments, I have learnt that getting 
>> data itself is a big task. Any data is a boon. And once I get the data, I 
>> don't expect it to match anything else. With this paradigm, the Census maps 
>> are a goldmine for me.
>>
>> Regards,
>> Devdatta
>>
>>
>> On Fri, Jul 18, 2014 at 8:58 AM, Justin Meyers <justinell...@gmail.com> 
>> wrote:
>>
>>> Devdatta,
>>> Thanks for the quick response. I thought the files originated from the 
>>> Survey of India, but wasn't certain.  I started to create a villages 
>>> dataset, but the tehsils do not really align with what the 2001 census 
>>> villages state their respective parent tehsil is...  So I am assuming all of 
>>> the data from the government is a mixed bag (spelling may be off, codes may 
>>> be wrong/outdated, data may be mixed between years).  What a mess!?!?!  As 
>>> for rectifying and creating maps based off the PDFs, I'm not sure I would 
>>> do that.  The lines they have for boundaries are very, very generalized. 
>>> Also, I tried (a few years ago) to line them up with actual vector data, 
>>> and there is a huge shift (I was using WGS84 vector data, so maybe I should 
>>> have reprojected).
>>>
>>> Maybe it would be best to start top down or bottom up.  So either build 
>>> a dataset from villages up to states or states down to villages.
>>>
>>> Thoughts?  We need some official data though (which seems impossible to 
>>> find...).  But anything is possible, right!?!
>>>
>>> Cheers,
>>> Justin
>>>
>>> On Thursday, July 17, 2014 11:17:33 PM UTC-4, Devdatta Tengshe wrote:
>>>
>>>> Hi Justin,
>>>> I know the euphoria that one has when one has done something new. It's 
>>>> one of the best things in the world.
>>>>
>>>> If the original source you mentioned is Bhuvan, then the files came 
>>>> directly from Survey of India. I have used those files before, and as you 
>>>> mentioned there were only some 2000-odd features in it.
>>>>
>>>> They are not from any specific era. Some tehsils in the file were 
>>>> created post 2001, while others created in the 90's were not present.
>>>>
>>>> The only exhaustive source I know of is the Census Administrative Atlas. 
>>>> They have maps in PDF format, not in shapefiles, and I had used it to 
>>>> create the district shapefiles which are shared on the datameet github 
>>>> repos.
>>>> Sometimes I feel I should get started on digitizing those PDFs. It 
>>>> shouldn't take more than 40 hours. 
>>>>
>>>> Regards,
>>>> Devdatta Tengshe
>>>>
>>>>
>>>> On Fri, Jul 18, 2014 at 8:27 AM, Justin Meyers <justinell...@gmail.com> 
>>>> wrote:
>>>>
>>>>> Devdatta,
>>>>> Sorry I didn't type that up.  I just finished processing it and was 
>>>>> excited and posted.  The previous file I posted had 2,693 features.  This 
>>>>> file has 2,739 features.  Initially I thought the data was relevant to 
>>>>> 2001, but maybe it is 1991 (I have no metadata, and the Indian government 
>>>>> does not respond to my e-mails; I have sent at least a dozen).  I am not 
>>>>> certain of the exact source; it is hosted by Bhuvan (who do not respond 
>>>>> to emails either....).
>>>>>
>>>>> As for processing, I took the data and sorted out the attributes (it 
>>>>> was one long string all attached together, so I split it and created the 
>>>>> fields).
>>>>>
>>>>> Any other questions?  If you know of a more current dataset please 
>>>>> post!!
>>>>>
>>>>>
>>>>> On Thursday, July 17, 2014 9:19:41 PM UTC-4, Devdatta Tengshe wrote:
>>>>>
>>>>>> Hi Justin,
>>>>>>
>>>>>> Can you let us know what procedure was used to create this file, and 
>>>>>> up to which date it is accurate?
>>>>>> I'm asking because this shapefile has 2739 sub-districts, and according 
>>>>>> to the census, there should be 5564.
>>>>>>
>>>>>> Regards,
>>>>>> Devdatta Tengshe
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 17, 2014 at 11:15 PM, Justin Meyers <
>>>>>> justinell...@gmail.com> wrote:
>>>>>>
>>>>>>> https://app.box.com/s/486rvabh3sjviiynbyu4
>>>>>>>
>>>>>>>
>>>>>>> Cheers!
>>>>>>>
>>>>>>> -- 
>>>>>>> Datameet is a community of Data Science enthusiasts in India. Know 
>>>>>>> more about us by visiting http://datameet.org
>>>>>>> --- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "datameet" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to datameet+u...@googlegroups.com.
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>
>>
>>

