Re: [datameet] Re: Data Sharing Guidelines

2014-05-26 Thread Thejesh GN
Yes. SQLite is good. I blogged about it as a format for open data some time
back:

https://thejeshgn.com/2012/08/23/sqlite-open-format-for-open-data/

Maybe it's time to do an "SQL for Poets" kind of workshop for those who are
interested.
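
As a rough idea of what such a workshop could start with, here is a minimal
sketch of two linked tables in a single SQLite file, queried with a JOIN. It
uses Python's built-in sqlite3 module. The table names, columns, and sample
rows are assumptions made up for illustration; they are not from any DataMeet
dataset.

```python
import sqlite3


def build_demo_db(path=":memory:"):
    """Create a small database with two linked tables (hypothetical schema)."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE state (
            state_code TEXT PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE constituency (
            id         INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            state_code TEXT NOT NULL REFERENCES state(state_code)
        );
        """
    )
    # Illustrative sample rows only.
    conn.executemany(
        "INSERT INTO state (state_code, name) VALUES (?, ?)",
        [("KA", "Karnataka"), ("MH", "Maharashtra")],
    )
    conn.executemany(
        "INSERT INTO constituency (name, state_code) VALUES (?, ?)",
        [("Bangalore South", "KA"), ("Mysore", "KA"), ("Pune", "MH")],
    )
    conn.commit()
    return conn


def constituencies_per_state(conn):
    """Count constituencies per state by joining the two tables."""
    return conn.execute(
        """
        SELECT s.name, COUNT(c.id)
        FROM state AS s
        LEFT JOIN constituency AS c ON c.state_code = s.state_code
        GROUP BY s.state_code
        ORDER BY s.name
        """
    ).fetchall()


if __name__ == "__main__":
    for state_name, count in constituencies_per_state(build_demo_db()):
        print(state_name, count)
```

Passing a file name instead of ":memory:" would give one .sqlite file holding
both tables, which is what makes it convenient to share.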

--
Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0
On May 26, 2014 8:23 AM, Raphael Susewind li...@raphael-susewind.de
wrote:

 Dear all,

 for complex files, I would suggest SQLite (sqlite.org). It is open,
 scalable, and extremely rich due to SQL queries. I use it for all my
 more complex datasets, interlinked tables, etc...

 My five cents,
 Raphael

 On 26.05.2014 06:50, Dilip Damle wrote:
  Hello,
 
  I think we need to discuss the following
 
  1. When is data eligible to go into the Repository?
 
  There could be several factors here, mainly cleanliness and completeness.
 
  2. A place other than the Repository for temporary data
  I think it should surely not be only an attachment to a post here,
  because then it becomes difficult to find later.
  Administrators should decide on a suitable place.
 
  3. The particular formats themselves
 
  This could vary based on the type of data.
 
  My observation is that for many types of data, multiple linked tables
  serve better than a single CSV file, which is more common.
  In this case, is .mdb acceptable, or is there any other open format for
  linked tables?
 
  This could be a long topic...
 
  4. Compressing multiple files into one file
 
  Unless there is a reason not to, multiple files that go together should
  be bundled into one file.
  This should also be true for the repository.
 
  5. About the content itself
 
  Since multiple people will contribute to and edit the data, we will have
  to have some rules.
  Example: when there is a unique identifier for the data, it should always
  be used; otherwise combining and comparing the data becomes difficult.
  (At present I am trying to collate the election results data and find
  that there are differences between the sources, especially in the names
  of places. I will be putting up the collated data in .mdb format in a
  few days.)
 
  On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:
 
  In the discussion guidelines thread, Dilip suggested we have some
  data sharing guidelines and a place to store some of the more casual
  datasets people are cleaning up.
 
  I think it's a good idea.
 
  Can we use this thread as a place to discuss formats, procedure, and
  a good place to put them?
 
  We have a GitHub already set up; we can start with that, maybe
  create a project called "Data that needs to be cleaned up".
 
  Any other suggestions?
 
  Nisha
 
  --
  Nisha Thompson
  DataMeet.org
  ni...@datameet.org
  skype: nishaqt
  mobile: 962-061-2245
 
  --
  Datameet is a community of Data Science enthusiasts in India. Know more
  about us by visiting http://datameet.org
  ---
  You received this message because you are subscribed to the Google
  Groups datameet group.
  To unsubscribe from this group and stop receiving emails from it, send
  an email to datameet+unsubscr...@googlegroups.com.
  For more options, visit https://groups.google.com/d/optout.

 --
 Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
   Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web  Twitter | http://www.raphael-susewind.de | @RaphaelSusewind

 Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)





[datameet] Re: Data Sharing Guidelines

2014-05-25 Thread Dilip Damle
Hello, 

I think we need to discuss the following 

1. When is data eligible to go into the Repository?

There could be several factors here, mainly cleanliness and completeness.

2. A place other than the Repository for temporary data
I think it should surely not be only an attachment to a post here,
because then it becomes difficult to find later.
Administrators should decide on a suitable place.

3. The particular formats themselves

This could vary based on the type of data.

My observation is that for many types of data, multiple linked tables
serve better than a single CSV file, which is more common.
In this case, is .mdb acceptable, or is there any other open format for
linked tables?

This could be a long topic...

4. Compressing multiple files into one file

Unless there is a reason not to, multiple files that go together should be
bundled into one file.
This should also be true for the repository.
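
A minimal sketch of the bundling idea in point 4, using Python's standard
zipfile module. The file names and contents are made up for illustration, and
ZIP is only one possible container; this is an assumption, not an agreed
format.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle_files(paths, archive_path):
    """Write the given files into one ZIP archive, stored by base name."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, paths):
            # arcname drops the directory part so the archive stays flat.
            zf.write(path, arcname=path.name)
    return archive_path


def list_bundle(archive_path):
    """Return the sorted names of the files inside the archive."""
    with zipfile.ZipFile(archive_path) as zf:
        return sorted(zf.namelist())


def demo():
    """Bundle a data file with its README (hypothetical files) and list them."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        data = tmp / "results.csv"
        readme = tmp / "README.txt"
        data.write_text("code,place,votes\nC01,Example Town,1200\n", encoding="utf-8")
        readme.write_text("Describes results.csv (illustrative).\n", encoding="utf-8")
        archive = bundle_files([data, readme], tmp / "election_bundle.zip")
        return list_bundle(archive)


if __name__ == "__main__":
    print(demo())
```

Keeping a short README inside the bundle means the description travels with
the data.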

5. About the content itself 

Since multiple people will contribute to and edit the data, we will have to
have some rules.
Example: when there is a unique identifier for the data, it should always be
used; otherwise combining and comparing the data becomes difficult.
(At present I am trying to collate the election results data and find that
there are differences between the sources, especially in the names of places.
I will be putting up the collated data in .mdb format in a few days.)
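
To illustrate point 5, here is a small sketch of why matching on a unique
code is more reliable than matching on place names. The two sources, the
codes, and the figures are invented for illustration; only the spelling
differences mimic the kind of mismatch described above.

```python
# Two hypothetical sources describing the same three places.
source_a = [
    {"code": "C01", "place": "Bangalore South", "votes": 1200},
    {"code": "C02", "place": "Mysore", "votes": 950},
    {"code": "C03", "place": "Belgaum", "votes": 780},
]
source_b = [
    {"code": "C01", "place": "Bengaluru South", "turnout": 55.2},
    {"code": "C02", "place": "Mysuru", "turnout": 61.0},
    {"code": "C03", "place": "Belgaum", "turnout": 58.4},
]


def index_by(rows, key):
    """Build a lookup from the chosen key to the row that carries it."""
    return {row[key]: row for row in rows}


def match_on(key, a, b):
    """Return the sorted key values present in both sources."""
    return sorted(index_by(a, key).keys() & index_by(b, key).keys())


if __name__ == "__main__":
    # Matching on names only finds rows whose spelling happens to agree.
    print("matched on place:", match_on("place", source_a, source_b))
    # Matching on the unique code finds every row.
    print("matched on code:", match_on("code", source_a, source_b))
```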

On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:

 In the discussion guidelines thread, Dilip suggested we have some data
 sharing guidelines and a place to store some of the more casual datasets
 people are cleaning up.

 I think it's a good idea.

 Can we use this thread as a place to discuss formats, procedure, and a
 good place to put them?

 We have a GitHub already set up; we can start with that, maybe create a
 project called "Data that needs to be cleaned up".

 Any other suggestions?

 Nisha

 -- 
 Nisha Thompson
 DataMeet.org
 ni...@datameet.org
 skype: nishaqt
 mobile: 962-061-2245
  
