Hi all, 

With reference to the ECI letter thread, I feel it is time to revive this 
thread.

On Tuesday, May 27, 2014 8:46:27 AM UTC+5:30, Nisha Thompson wrote:
>
> Thanks Dilip,
>
> Those 5 points are right on. I would also add a point about ownership and 
> licensing. 
>
> I think formats are a good conversation to have. 
>
> Comments on the others are below.
>
>
>  On 26.05.2014 06:50, Dilip Damle wrote:
>>> > Hello,
>>> >
>>> > I think we need to discuss the following
>>> >
>>> > 1. When is the data eligible to go to the repository?
>>> >
>>> > There could be several factors here, mainly cleanliness and completeness.
>>>
>> 1) I would like to figure out what a good threshold of cleanliness and 
> completeness is. I think robust metadata is important for that.
>  
>
>>  >
>>> > 2. A place other than the repository for temporary data.
>>> > I think it should surely not be "only an attachment to a post here",
>>> > because then it becomes difficult to find later.
>>> > Administrators should decide on a suitable place.
>>>
>> A temporary file isn't a bad idea. Maybe a Google Drive or Dropbox run 
> by DataMeet could do that. We could put up the tasks for each dataset on the 
> web and ask people to clean up, then give access? 
>
>>  >
>>> > 3. The particular formats themselves
>>> >
>>> > This could vary based on type of data
>>> >
>>> > My observation is that for many types of data, multiple linked tables
>>> > serve better than the more common single CSV file.
>>> > In that case, is .mdb acceptable, or is there an open format for
>>> > linked tables?
>>> >
>>> > this could be a long topic...
>>> >
>>> > 4. Compressing multiple files in one file
>>> >
>>> > Unless there is a reason not to, multiple files that go together should be
>>> > bundled into one file.
>>> > This should also be true for the repository.
>>> >
>>> > 5. About the content itself
>>> >
>>> > Since multiple people will contribute to and edit the data, we will need
>>> > some rules.
>>> > Example: when there is a unique identifier for the data, it should always
>>> > be used; otherwise combining and comparing the data becomes difficult.
>>> > (At present I am trying to collate the election results data and find
>>> > there are differences between the sources, especially in the names
>>> > of places. I will be putting up the collated data in .mdb format in a few
>>> > days.)
>>>
>> I'm going to think about this for a bit, but I think standardization is a 
> really important task that requires a larger discussion. 
>
>>  >
>>> > On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:
>>> >
>>> >     In the discussion guidelines thread Dilip suggested we have some
>>> >     data sharing guidelines and a place to store some of the more
>>> >     casual datasets people are cleaning up.
>>> >
>>> >     I think it's a good idea.
>>> >
>>> >     Can we use this thread as a place to discuss formats, procedure,
>>> >     and a good place to put it?
>>> >
>>> >     We already have a GitHub set up, so we can start with that, maybe
>>> >     by creating a project called "Data that needs to be cleaned up".
>>> >
>>> >     Any other suggestions?
>>> >
>>> >     Nisha
>>> >
>>> >     --
>>> >     Nisha Thompson
>>> >     DataMeet.org
>>> >     ni...@datameet.org
>>> >     skype: nishaqt
>>> >     mobile: 962-061-2245
>>> >
>>> > --
>>> > Datameet is a community of Data Science enthusiasts in India. Know more
>>> > about us by visiting http://datameet.org
>>> > ---
>>> > You received this message because you are subscribed to the Google
>>> > Groups "datameet" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send
>>> > an email to datameet+u...@googlegroups.com.
>>> > For more options, visit https://groups.google.com/d/optout.
>>>
>>> --
>>> Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
>>>       Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
>>>    Web & Twitter | http://www.raphael-susewind.de | @RaphaelSusewind
>>>
>>> Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)
>>>
>>
>
>
>
> -- 
> Nisha Thompson
> DataMeet.org
> ni...@datameet.org
> skype: nishaqt
> mobile: 962-061-2245
>  
