Rows that don't have lat-long will get updated as the survey work progresses 
in the states concerned. Lat-long values that fall outside the extent will 
not be corrected and will remain as-is; they result from errors in the app or 
simply low GPS accuracy, and there will be very few of them. Once the survey 
work is complete and finalized in all states, the data will be mostly usable 
as-is from omms.nic.in. It's snowing right now, so survey work is on hold.
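
For anyone who wants to tell these two cases apart in their own copy of the 
data, here is a minimal Python sketch. It is only an illustration: the 
bounding box is a rough, assumed approximation of India's extent (not an 
official boundary), and the function name is hypothetical. Coordinates are 
passed in as raw values read from a CSV, so blanks are handled too.

```python
import math

# Rough bounding box for India. These limits are an assumption made for
# this sketch, not an official extent.
LAT_MIN, LAT_MAX = 6.0, 38.0
LON_MIN, LON_MAX = 68.0, 98.0


def classify_coordinates(lat, lon):
    """Return 'missing', 'out_of_extent' or 'ok' for one row's lat-long."""
    try:
        lat_f, lon_f = float(lat), float(lon)
    except (TypeError, ValueError):
        # Blank cells, None, or non-numeric text count as missing.
        return "missing"
    if math.isnan(lat_f) or math.isnan(lon_f):
        return "missing"
    if LAT_MIN <= lat_f <= LAT_MAX and LON_MIN <= lon_f <= LON_MAX:
        return "ok"
    return "out_of_extent"
```

Rows flagged 'missing' should fill in as the surveys complete; rows flagged 
'out_of_extent' are the ones expected to remain as-is.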

As the department has already released the data in a machine-readable format 
that doesn't require scraping etc., and under the Open Data License, please 
try to attribute the original source, link and license in your work and repo 
(4a/5 of the GODL).

I have two concerns. The primary one is to ensure that the data collection 
mechanism and assumptions are documented officially and are readily available 
(not just to people subscribed to this Google group), so that inferences 
people make from the data are grounded and more useful. The second is to find 
a mechanism for people to cite the original source, so that a case can be 
made in the future for releasing other such datasets within the government. 

On my side, it seems the best way to do this is to dedicate a static page 
with FAQs on OMMAS and to further release this data on data.gov.in. That 
said, in the true spirit of the license, anyone can host it anywhere, subject 
to Sections 4 and 7 of the GODL.


On Friday, 27 November 2020 at 20:01:35 UTC+5:30 [email protected] wrote:

> Images seem to have been lost. Attaching them.
> Distribution for states.  
>
> [image: pm1.PNG]
> 22 rows which probably have sub-category mislabeled? 
> [image: pm2.PNG]
>
> On Friday, November 27, 2020 at 9:27:56 AM UTC-5 Pratap Vardhan wrote:
>
>> Thanks Nisar, that's useful to know. I'll update the repo with these 
>> pointers. 
>> What frequency for updates would you suggest (monthly or otherwise)? If you prefer, 
>> we can move the repo to datameet org or any other widely accessible one and 
>> collectively edit it.  
>> So, what I meant about coordinates data is, some rows are blank and some 
>> have coordinates but beyond India's extent bounds. I guess they will get 
>> fixed with updates too.
>> Here's the distribution for states.
>>
>> Separately, minor issue perhaps - there are 22 rows which probably have 
>> sub-category mislabeled.
>>  
>> Thanks for the details!
>> On Friday, November 27, 2020 at 12:47:59 AM UTC-5 [email protected] 
>> wrote:
>>
>>> Really beautiful!  
>>>
>>> I'll answer some of the queries you raised on the tweet thread here for 
>>> everyone. 
>>>
>>> You've commented that the data in UK, HP & Nagaland appears erroneous. I 
>>> am assuming this is because the lat/long is missing. That is because these 
>>> states are still doing the survey and haven't completed it. They may 
>>> complete it in the next few months. Many of the other NE states are also in 
>>> a similar position. Goa and most UTs haven't been onboarded to the scheme 
>>> yet. For the same reason, I would point people towards the original dataset 
>>> or put in a system to update your GH repos regularly.
>>>
>>> Apart from that - I'll re-iterate that the data was collected by 
>>> government rural engineers at the block level. Intention, accuracy and even 
>>> understanding of definitions will vary across blocks/districts and 
>>> especially states. The data serves its primary purpose with these 
>>> assumptions but may lead to misleading statistics if treated as a census 
>>> for cross-geography comparisons. 
>>>
>>>
>>> On Friday, 27 November 2020 at 10:50:23 UTC+5:30 [email protected] 
>>> wrote:
>>>
>>>> I've pulled states csvs to this repo 
>>>> https://github.com/pratapvardhan/rural-facilities-pmgsy. Consolidated 
>>>> India csv is at 
>>>> https://www.kaggle.com/pratapvardhan/770k-geotagged-rural-facilities-in-india-pmgsy
>>>> And I posted a thread with a couple of visuals and minor data issues here 
>>>> https://twitter.com/PratapVardhan/status/1332174593877020673
>>>> Would love to hear if you use this data to create something.
>>>>
>>>> On Monday, November 23, 2020 at 2:05:47 AM UTC-5 Arun Ganesh wrote:
>>>>
>>>>> Wonderful Harsh, it's amazing to see such rural spatial datasets opened 
>>>>> by the government. The OpenStreetMap India community is looking into the 
>>>>> data to get a better sense of its quality and see how it could be 
>>>>> integrated with the existing OSM basemap.
>>>>>
>>>>> As a sample, I have made an interactive map of just the Kerala data if 
>>>>> anyone wants to explore: 
>>>>> https://api.mapbox.com/styles/v1/planemad/ckhrbz1o6038o19piti441hd7.html?fresh=true&title=copy&access_token=pk.eyJ1IjoicGxhbmVtYWQiLCJhIjoiemdYSVVLRSJ9.g3lbg_eN0kztmsfIPxa9MQ#10.53/9.6726/76.524
>>>>>
>>>>> The dataset points are in yellow, the rest are from OSM. Some initial 
>>>>> evaluations in Kerala and Karnataka suggest that the data is pretty good 
>>>>> and spatially accurate within 100m.
>>>>>
>>>>

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
To view this discussion on the web visit 
https://groups.google.com/d/msgid/datameet/d3845bc8-b276-4c26-b3e6-ac221d3b271fn%40googlegroups.com.
