I've added a note at the top with the source, license and last-updated date. Is there a recommended citation you'd like me to include? (Also, feel free to submit a PR with anything you think would keep readers better informed.)
https://github.com/pratapvardhan/rural-facilities-pmgsy#rural-facilities-geo-tagged-dataset-pmgsy

On Friday, November 27, 2020 at 4:01:15 PM UTC-5 [email protected] wrote:

> Rows which don't have lat-long will get updated as the work progresses in
> the concerned states. The lat-long which are out of extent will not be
> corrected and will remain as-is because of errors in the app or simply low GPS
> accuracy. They'll be very few. Once the survey work is complete and
> finalized in all states, it will be mostly usable as-is from omms.nic.in.
> It's snowing right now so survey work is on hold.
>
> As the department has already released the data in a machine-readable format
> that doesn't require scraping etc., and under an Open Data License, please try to
> attribute the original source, link and license in your work and repo (4a/5
> of the GODL).
>
> I have two concerns. The primary one is to ensure the data collection mechanism
> and assumptions are documented officially and readily available (not just
> for people subscribed to this google group) so that inferences made by
> people on the data are grounded and more useful. The second is to find a
> mechanism for people to cite the original source so that a case can be made
> in the future for releasing other such datasets within the government.
>
> On my side, it seems the best way to do this is to dedicate a static page
> with FAQs on ommas & further release this data on data.gov.in. Though
> in true spirit, anyone can host it anywhere subject to Section 4 and 7 of
> GODL.
>
> On Friday, 27 November 2020 at 20:01:35 UTC+5:30 [email protected] wrote:
>
>> Images seem to have been lost. Attaching them.
>> Distribution for states.
>>
>> [image: pm1.PNG]
>> 22 rows which probably have sub-category mislabeled?
>> [image: pm2.PNG]
>>
>> On Friday, November 27, 2020 at 9:27:56 AM UTC-5 Pratap Vardhan wrote:
>>
>>> Thanks Nisar, that's useful to know. I'll update the repo with these
>>> pointers.
>>> What frequency for updates would you suggest (monthly or)?
>>> If you prefer, we can move the repo to datameet org or any other widely
>>> accessible one and collectively edit it.
>>> So, what I meant about the coordinates data is: some rows are blank and some
>>> have coordinates beyond India's extent bounds. I guess they will get
>>> fixed with updates too.
>>> Here's the distribution for states.
>>>
>>> Separately, minor issue perhaps - there are 22 rows which probably have
>>> sub-category mislabeled.
>>>
>>> Thanks for the details!
>>> On Friday, November 27, 2020 at 12:47:59 AM UTC-5 [email protected]
>>> wrote:
>>>
>>>> Really beautiful!
>>>>
>>>> I'll answer some of the queries you raised on the tweet thread here for
>>>> everyone.
>>>>
>>>> You've commented that the data in UK, HP & Nagaland appears erroneous.
>>>> I am assuming that is because the lat/long is missing. It is so because these
>>>> states are still doing the survey and haven't completed it. They may complete
>>>> it in the next few months. Many of the other NE states are also in a
>>>> similar position. Goa and most UTs haven't been onboarded to the scheme
>>>> yet. For the same reason, I would point people towards the original dataset
>>>> or put in a system to update your GH repos regularly.
>>>>
>>>> Apart from that - I'll reiterate that the data was collected by
>>>> government rural engineers at the block level. Intention, accuracy and
>>>> even understanding of definitions will vary across blocks/districts and
>>>> especially states. The data serves its primary purpose with these
>>>> assumptions but may lead to misleading statistics if treated as a census
>>>> for cross-geography comparisons.
>>>>
>>>>
>>>> On Friday, 27 November 2020 at 10:50:23 UTC+5:30 [email protected]
>>>> wrote:
>>>>
>>>>> I've pulled states csvs to this repo
>>>>> https://github.com/pratapvardhan/rural-facilities-pmgsy.
>>>>> Consolidated India csv is at
>>>>> https://www.kaggle.com/pratapvardhan/770k-geotagged-rural-facilities-in-india-pmgsy
>>>>> And, posted a thread with a couple of visuals and minor data issues here
>>>>> https://twitter.com/PratapVardhan/status/1332174593877020673
>>>>> Would love to hear if you use this data to create something.
>>>>>
>>>>> On Monday, November 23, 2020 at 2:05:47 AM UTC-5 Arun Ganesh wrote:
>>>>>
>>>>>> Wonderful Harsh, it's amazing to see such rural spatial datasets
>>>>>> opened by the government. The OpenStreetMap India community is looking
>>>>>> into the data to get a better sense of the quality and see how it could be
>>>>>> integrated with the existing OSM basemap.
>>>>>>
>>>>>> As a sample, I have made an interactive map of just the Kerala data if
>>>>>> anyone wants to explore:
>>>>>> https://api.mapbox.com/styles/v1/planemad/ckhrbz1o6038o19piti441hd7.html?fresh=true&title=copy&access_token=pk.eyJ1IjoicGxhbmVtYWQiLCJhIjoiemdYSVVLRSJ9.g3lbg_eN0kztmsfIPxa9MQ#10.53/9.6726/76.524
>>>>>>
>>>>>> The dataset points are in yellow, the rest are from OSM. Some initial
>>>>>> evaluations in Kerala and Karnataka suggest that the data is pretty good
>>>>>> and spatially accurate within 100m.
>>>>>>
>>>>>
--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups "datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/datameet/9ef704a9-ab9e-4226-9e63-ab2aabb6f08dn%40googlegroups.com.
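
For anyone who wants to reproduce the per-state check discussed in this thread (rows with blank lat-long versus rows whose coordinates fall outside India's extent), here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption for illustration and not taken from the dataset: the column names `state`, `latitude` and `longitude`, the rough bounding box in `INDIA_BOUNDS`, and the made-up sample rows. Check the actual CSV headers before using it.

```python
import csv
import io
import math
from collections import Counter, defaultdict

# Rough bounding box for India including the island territories.
# These limits are an approximation chosen for illustration only.
INDIA_BOUNDS = {"lat_min": 6.0, "lat_max": 37.5, "lon_min": 68.0, "lon_max": 97.5}


def classify_coordinates(lat_text, lon_text, bounds=INDIA_BOUNDS):
    """Return 'missing', 'out_of_extent' or 'ok' for one row's coordinates."""
    try:
        lat = float(lat_text)
        lon = float(lon_text)
    except (TypeError, ValueError):
        # Blank cells, None, or non-numeric text count as missing.
        return "missing"
    if math.isnan(lat) or math.isnan(lon):
        return "missing"
    inside = (
        bounds["lat_min"] <= lat <= bounds["lat_max"]
        and bounds["lon_min"] <= lon <= bounds["lon_max"]
    )
    return "ok" if inside else "out_of_extent"


def summarize_by_state(rows, state_col="state", lat_col="latitude", lon_col="longitude"):
    """Count coordinate status per state; column names are hypothetical."""
    summary = defaultdict(Counter)
    for row in rows:
        status = classify_coordinates(row.get(lat_col), row.get(lon_col))
        summary[row.get(state_col, "")][status] += 1
    return summary


# Made-up rows, only to show the shape of the input.
SAMPLE = """state,facility,latitude,longitude
Kerala,School,9.6726,76.524
Kerala,Clinic,,
Nagaland,Market,0.0,0.0
Nagaland,School,25.67,94.11
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for state, counts in summarize_by_state(rows).items():
    print(state, dict(counts))
```

To run it on a real state file, replace `io.StringIO(SAMPLE)` with an opened CSV file and pass the actual column names to `summarize_by_state`. As noted above, blank rows are expected to be filled in as the surveys complete, while out-of-extent rows will remain as-is, so it is probably worth reporting the two counts separately.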
