Mr. Nagarajan,

I appreciate the move that the govt. has taken. But since you are trying to provide services that bring better transparency, it is important to adhere to best technical practices.
As Sumant said in his earlier post, the data has to be in electronic form. It makes sense that, as you said, you are scanning all the data from the 1960s onwards into PDFs and JPGs. But it would make better sense for you/us to gradually move to publishing data that is "ACCESSIBLE" in nature. The data you provide as separate scanned files might not be enough as a stand-alone data set, and people may need different data sets for their own purposes (which might be analysis of the data for any number of causes). And it's not just people; even govt. departments might want to reuse the data at some other time. So publishing in *accessible* formats is very important.

XML-based representations like RDF are good options, since they also have the potential to link the data with other data sets. I am sure many of us here in the group are capable of helping you with that, should you wish to seek assistance. You can take a look at how data.gov.uk, the UK govt.'s official data hub, publishes its data and the kind of accessible services it provides. I have personally explored it a lot and have myself worked on scraping data out of old systems to convert it to more accessible formats.

Coming to the point where you said "*Much of the current work is being done in MS Word and Excel but there is no systematic way of storing or publishing them*", I think it becomes utterly important for you to judge which fields of a given data set are meaningful enough to be published. Suppose you have, say, 5000 records of various schemes that the govt. is planning to implement, each still, as you termed it, "*pre-processed*", with a status that is liable to change over time.
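To make the RDF point a little more concrete, here is a minimal sketch in Python of how one such record could be written out as linked data. It is only an illustration under my own assumptions: the base URI and vocabulary under example.org, the record id "scheme-0001" and the field names are all hypothetical, and I have used N-Triples (a plain-text RDF serialization) rather than RDF/XML simply because it is easier to read in a mail.

```python
from urllib.parse import quote

# Hypothetical namespaces, for illustration only; a real deployment
# would use URIs that the publishing office actually controls.
BASE = "http://example.org/schemes/"
VOCAB = "http://example.org/vocab#"


def escape_literal(value: str) -> str:
    """Escape a string so it is safe inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record_id: str, fields: dict) -> list:
    """Turn one record into N-Triples lines, one line per field.

    The record gets a single URI (the subject); every field becomes
    a predicate under the hypothetical vocabulary above.
    """
    subject = f"<{BASE}{quote(record_id, safe='')}>"
    lines = []
    for name, value in sorted(fields.items()):
        predicate = f"<{VOCAB}{quote(name, safe='')}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


if __name__ == "__main__":
    # Made-up example record; not real scheme data.
    record = {"title": "Housing scheme", "status": "pre-processed"}
    for line in record_to_ntriples("scheme-0001", record):
        print(line)
```

When the status of a record changes, only the line for that one field has to be regenerated; the record's URI stays the same, so anything that links to it keeps working.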
Creating an exclusive file tracker could be one way. But technically speaking, if every file (in this case, the file for every record) is given a separate unique identity on the network (call it a URI) and the data format is accessible (machine-understandable and human-comprehensible), then the record only needs to be updated on the back end, with just a few of its fields changing.

I mean, there are a lot of different ways in which an organization might want to go about publishing its data. But we, the people in this group and elsewhere too, are broadly a merger of 2 perspectives:

1) *Responsible citizens:* we want the system to be transparent and the govt. to share its data, a move you are already planning.
2) *Techies:* we do not want data that merely sits on the internet, or in some grave where we need to dig up a file, read it, and go back home with notes in our diaries to memorize. We want it accessible, and we can surely talk about how to make things accessible!

regards,
Prashant

On Tuesday, 28 August 2012 13:01:17 UTC+1, Nagarajan M wrote:
>
> Very true, Mr. Sumant.
>
> Regarding conversion of physical records to electronic form: right now
> many of the records are being scanned, and PDFs and JPGs are being created.
> For example, the land records have been maintained on a central server online
> for the past 8 years. All data before that, lying as physical records dating
> back to the 1960s, is being scanned. Converting the images to text and making
> sense of them is a big challenge.
>
> Much of the current work is being done in MS Word and Excel but there is
> no systematic way of storing or publishing them. While some reports are
> consolidated at the central or state govt. level, the data can be more
> granular and accurate if direct publishing at source is done. It will take
> time for such a culture to emerge. But we can facilitate it.
>
> By creating tools that help in pre-processed output that will help in
> publication.
> For example, when you apply for a permission related to land
> usage, it takes anywhere from 3 to 6 months if all is well.
> Suppose we create a file tracking system that supports the office work and
> at the same time holds the data in an accessible machine-readable format,
> and when it's plugged into the net, it creates a win-win. This system can be
> used by any individual office that wants to do it... say a
> nagarpalika, village panchayat, a department, govt. undertakings, you name it.
>
> What if we aggregate all such data in a central repository with automatic
> sync... your wish for unaltered data could actually come true.
>
> Let me know what you think.
>
> Regards,
>
> Nagarajan
>
>
> On Tue, Aug 28, 2012 at 2:41 PM, Sumant Suresh Kulkarni <[email protected]> wrote:
>
>> Dear Nagarajan,
>>
>> It is great to see the interest in opening up government data. That
>> would certainly help us understand a lot of things. I'm trying to answer both
>> of your questions.
>>
>> *1. How can a district collector open up his office data and make it
>> accessible?*
>> There can be data of two kinds: (a) data in ledgers and (b) data available
>> in electronic form. Opening up the data depends on what form the data is
>> currently available in.
>>
>> (a) Data present only in ledgers (not as an e-copy) can be difficult to
>> share. To share it, it has to be converted into electronic form. An XML-based
>> representation or a database representation of the data can be
>> created. Then the XML/database dumps can be made available for download by
>> putting them on some site. Even though this is a high-effort task, it is
>> worth doing, as the *data analysis can give many insights* to the people
>> and to administrators as well.
>>
>> (b) The data present in electronic form (data entered in a computer) is in
>> some proprietary software provided to governments. Many such software packages
>> do provide options to export the data into files (CSV or XML).
>> Once the data
>> is exported to these formats, you can host the files on some website and let
>> people download them.
>>
>> *2. Would you be happy to analyse data sets on free houses to poor,
>> social benefits data, NREGA works data?*
>> We would be really happy to analyse these data sets if they are provided
>> to us. I believe that there are quite a few people with a similar interest in
>> analysing such data sets. The analysis will surely yield some non-obvious
>> insights, which can, hopefully, help implement the schemes
>> better. However, the insights will only be as good as the data we get. If we get
>> unaltered, complete data, we might be able to give better insights about it.
>>
>> Regards,
>> Sumant
>>
>>
>> On Mon, Aug 27, 2012 at 4:28 PM, Nagarajan M <[email protected]> wrote:
>>
>>> Friends,
>>>
>>> I recently joined the datameet group and have been following the emails.
>>>
>>> I am glad that you are considering exploring the Govt. initiatives. As an
>>> officer in Government, I will be happy to get the information required by
>>> the group to help you understand and contribute.
>>>
>>> I have some initial questions to work on in the previous mail. If there
>>> is anything more I can help with, I will gladly try.
>>>
>>> Apart from the release of data sets, we can try to build tools that enable
>>> opening up data at department and office levels. For example, how can a
>>> district collector open up his office data and make it accessible? What
>>> tools will he need?
>>>
>>> Would you be happy to analyse data sets on free houses to poor, social
>>> benefits data, NREGA works data... the list is endless.
>>>
>>> Opengov can work only if we all work together.
>>>
>>> Thanks,
>>>
>>> Nagarajan M, IAS
>>> On Aug 27, 2012 3:12 PM, "Nisha Thompson" <[email protected]> wrote:
>>>
>>>> I’m glad you brought up the NIC’s data.gov.in.
>>>>
>>>> I think we should try to get a Q&A session with them to see if they
>>>> can 1) walk us through the new site, 2) allow us to have a public place
>>>> for feedback and complaints, 3) see if there are any joint projects we can
>>>> do with them for their launch (hackathon etc.).
>>>>
>>>> This is the contact information for them:
>>>>
>>>> [email protected]
>>>>
>>>> [email protected]
>>>>
>>>> I believe the datasets are supposed to be ready soon.
>>>>
>>>> They are also on this list. So feel free to respond to this mail.
>>>>
>>>> Nisha
>>>>
>>>>
>>>> On Sun, Aug 26, 2012 at 7:55 AM, Ankur Nagar <[email protected]> wrote:
>>>>
>>>>> Speaking of Open Data on India - has anyone peeked into
>>>>> http://data.gov.in/ <http://data.gov.in/community/developer> beta?
>>>>> (runs of course on OGPL http://www.opengovplatform.org/)
>>>>>
>>>>> - Ankur
>>>>> https://finances.worldbank.org
>>>>> @ankur_nagar
>>>>>
>>>>>
>>>>> On Saturday, August 25, 2012 5:08:20 PM UTC-4, Pranesh Prakash wrote:
>>>>>
>>>>>> Gautam John [2012-08-23 10:53]:
>>>>>> > There is also http://ckan.org/
>>>>>>
>>>>>> OKF has split CKAN into two: the software <http://ckan.org> and an
>>>>>> open data repository / hub <http://thedatahub.org> that runs on CKAN. So
>>>>>> you'd want to check out the latter.
>>>>>>
>>>>>> ~ Pranesh
>>>>>>
>>>>>> --
>>>>>> Pranesh Prakash · Programme Manager · Centre for Internet and Society
>>>>>> @pranesh_prakash · PGP ID 0x1D5C5F07 · http://cis-india.org
>>>>>>
>>>>> --
>>>>> For more details about this list
>>>>> http://datameet.org/discussions/
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "datameet" group.
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected].
>>>> --
>>>> Nisha Thompson
>>>> Mobile: 962-061-2245
>>
>> --
>> Regards,
>> Sumant
>
> --
> With Best Regards,
>
> Nagarajan M, IAS
> Asst. Collector
> Tharad
> District Banaskantha
> Gujarat
> M : 099132 71733
