[datameet] Re: 'data science' and 'open data'

2014-05-25 Thread sumandro
Dear all,

I agree that 'open data' and 'data science' do not necessarily fit together 
cleanly. That's why I referred to the third term of 'open science' that 
talks about both openness of data and openness of process and techniques of 
working with data, though in a limited context of scientific research.

However, I guess the better idea is to use these two terms -- 'open data' 
and 'data science' -- separately in the DataMeet byline.

Bests,

sumandro
ajantriks.net

On Friday, 23 May 2014 12:58:23 UTC+5:30, sumandro wrote:

 Dear all,

 This might have been discussed earlier, but I am raising it once again. 

 I see DataMeet as equally interested in data science and openness of data, 
 and this has been rather explicit in our discussions and activities during 
 the election season.

 Can the term 'open' come into the one-line description of the group? 

 Maybe we can merge 'open data' and 'open science' and 'data science' and 
 coin 'open data science'?

 Bests,

 sumandro
 ajantriks.net


-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
--- 
You received this message because you are subscribed to the Google Groups 
datameet group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[datameet] Re: free open public domain football data

2014-05-25 Thread Nikunj Parikh
Thanks a lot man!!

On Thursday, May 22, 2014 11:55:44 AM UTC+5:30, Vaibhav P wrote:

 Now that the FIFA World Cup is only a few days away, I think this might 
 interest people.

 http://openfootball.github.io/

 Regards,
 Vaibhav




[datameet] Re: draft 2 of ECI Letter

2014-05-25 Thread Dilip Damle
Hi, 

I think we should use the term 'ESRI Shapefile' instead of 'GIS' for the 
format. 

We could ask them whether there is any unique identification for candidates, 
so that we know when a person is contesting from more than one seat. 
Example: there is no way to know (from the data available) that 
Hon. Mr. Modi, who contested from Vadodara and Varanasi, is the same 
individual.
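
To illustrate why such an identifier would help, here is a minimal Python sketch. It assumes a hypothetical candidate_id field; no such field exists in the data discussed here, and the ids and names below are invented placeholders. Only the two constituency names come from the example above.

```python
from collections import defaultdict

# Hypothetical records: candidate_id and the name spellings are invented
# for illustration; the published data has no such identifier.
records = [
    {"candidate_id": "C001", "name": "Candidate X", "constituency": "Vadodara"},
    {"candidate_id": "C001", "name": "X, Candidate", "constituency": "Varanasi"},
    {"candidate_id": "C002", "name": "Candidate Y", "constituency": "Vadodara"},
]


def multi_seat_candidates(rows):
    """Return {candidate_id: sorted constituencies} for ids seen in more than one seat."""
    seats = defaultdict(set)
    for row in rows:
        seats[row["candidate_id"]].add(row["constituency"])
    return {cid: sorted(names) for cid, names in seats.items() if len(names) > 1}


print(multi_seat_candidates(records))
```

Grouping on the id finds the two-seat candidate even though the name is spelled differently in each record, which matching on names alone would miss.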


On Thursday, May 15, 2014 9:28:49 AM UTC+5:30, Nisha Thompson wrote:

 Hey All,

 I have changed a few things from the first version. I put it up on the 
 wiki for review here:

 http://datameet.org/wiki/odclettertoecidraft

 It would be good to send it now, since the elections are over and we can 
 follow up with them.

 Please add anything that you can think of.  It would also help if we could 
 make a list of suggested formats to add as an attachment.  I have added 
 XML, as Avinash suggested.

 If someone could edit the letter to take out all the Americanisms that 
 would be great also :)

 Nisha

 -- 
 Nisha Thompson
 DataMeet.org
 ni...@datameet.org
 skype: nishaqt
 mobile: 962-061-2245
  



Re: [datameet] draft 2 of ECI Letter

2014-05-25 Thread Anand Chitipothu
On Thu, May 15, 2014 at 9:28 AM, Nisha Thompson ni...@datameet.org wrote:

 Hey All,

 I have changed a few things from the first version. I put it up on the
 wiki for review here:

 http://datameet.org/wiki/odclettertoecidraft

 It would be good to send it now, since the elections are over and we can
 follow up with them.

 Please add anything that you can think of.  It would also help if we could
 make a list of suggested formats to add as an attachment.  I have added
 XML, as Avinash suggested.

 If someone could edit the letter to take out all the Americanisms that
 would be great also :)


A few important things that are very hard to find:

- list of assembly constituencies in each parliamentary constituency
- list of polling centers in an assembly constituency
- list of polling stations in each polling center
- name of ward/mandal/taluk for each polling center

The election commission appoints an officer for each polling center. So
they clearly have information about the polling centers and list of polling
stations in each polling center, but this information is not available
anywhere for public consumption.

A couple of the things listed above are available in various places,
especially the PDF voter lists. It would be very helpful and save a lot of
time if this information were available in a machine-readable format.
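
As a rough illustration of what a machine-readable version of the lists above could look like, here is a small Python sketch that nests them as JSON. The field names and all values are placeholders I made up; the ECI publishes no such structure, and XML or CSV would serve equally well.

```python
import json

# Placeholder names only; none of this is real ECI data.
hierarchy = {
    "parliamentary_constituency": "PC-1",
    "assembly_constituencies": [
        {
            "name": "AC-1",
            "polling_centers": [
                {
                    "name": "Center-1",
                    "ward_mandal_taluk": "Ward-1",
                    "polling_stations": ["PS-1", "PS-2"],
                }
            ],
        }
    ],
}


def polling_stations(pc):
    """Flatten the hierarchy into (assembly constituency, center, station) rows."""
    rows = []
    for ac in pc["assembly_constituencies"]:
        for center in ac["polling_centers"]:
            for station in center["polling_stations"]:
                rows.append((ac["name"], center["name"], station))
    return rows


# Serialise to JSON text, as it might be published.
print(json.dumps(hierarchy, indent=2))
print(polling_stations(hierarchy))
```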

Anand



[datameet] Re: Data Sharing Guidelines

2014-05-25 Thread Dilip Damle
Hello, 

I think we need to discuss the following:

1. When is data eligible to go into the repository?

There could be several factors here, mainly cleanliness and completeness. 

2. A place other than the repository for temporary data. 
I think it should certainly not be only an attachment to a post here, 
because then it becomes difficult to find later.
The administrators should decide on a suitable place.

3. The formats themselves 

This could vary based on the type of data.

My observation is that for many types of data, multiple linked tables 
serve better than a single CSV file, which is more common.
In that case, is .mdb acceptable, or is there any other open format for 
linked tables?

This could be a long topic...

4. Compressing multiple files into one file 

Unless there is a reason not to, multiple files that go together should be 
bundled into one file.
This should also hold for the repository.

5. About the content itself 

Since multiple people will contribute to and edit the data, we will need 
some rules.
Example: when there is a unique identifier for the data, it should always 
be used; otherwise combining and comparing the data becomes difficult.
(At present I am trying to collate the election results data, and I find 
there are differences between the sources, especially in the names of 
places. I will be putting up the collated data in .mdb format in a few days.)
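
On the question of an open format for linked tables in point 3, one option might be SQLite, a public-domain single-file database format. This is my suggestion rather than something proposed in the thread. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and rows are placeholders. It also shows the point about unique identifiers: results link to places by id, so spelling differences in place names do not matter.

```python
import sqlite3

# In-memory database for the sketch; a file path would give a shareable file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE constituency (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE result (
    constituency_id INTEGER NOT NULL REFERENCES constituency(id),
    candidate       TEXT NOT NULL,
    votes           INTEGER NOT NULL
);
""")

# Placeholder rows, not real election results.
conn.executemany("INSERT INTO constituency VALUES (?, ?)",
                 [(1, "Place A"), (2, "Place B")])
conn.executemany("INSERT INTO result VALUES (?, ?, ?)",
                 [(1, "Candidate X", 100), (1, "Candidate Y", 80), (2, "Candidate Z", 50)])


def totals(connection):
    """Total votes per constituency, joined on the unique id rather than the name."""
    return connection.execute("""
        SELECT c.name, SUM(r.votes)
        FROM result AS r
        JOIN constituency AS c ON c.id = r.constituency_id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()


print(totals(conn))
```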

On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:

 In the discussion guidelines thread, Dilip suggested we have some data 
 sharing guidelines and a place to store some of the more casual datasets 
 people are cleaning up.

 I think it's a good idea.

 Can we use this thread as a place to discuss formats, procedure, and a 
 good place to put it?  

 We have a GitHub already set up; we can start with that and maybe create a 
 project called 'Data that needs to be cleaned up'.  

 Any other suggestions?

 Nisha

 -- 
 Nisha Thompson
 DataMeet.org
 ni...@datameet.org
 skype: nishaqt
 mobile: 962-061-2245
  
