IMO the most important things are to have:

- permanently accessible URLs or other references (e.g. DOIs). For this

This is certainly the ideal.  Let's not fall into the trap of not
doing anything until the ideal system is in place.

Data is available now and some of it will probably be lost
before such a system is in place.  It is better to start
encouraging people to archive their data now than let them off the
hook of not having to do something until something in the future

- clear licensing that allows sharing (ideally open data such as CC, and
open source code such as GPL or BSD).

github (just one suggestion that is popular and up and running)
allows the licensing to be clearly specified by the owners of
the data it holds.

The advantage of open licensing is that if github or archive.org goes
bust, long after I have moved on to other interests, other people can
re-host my data and code. I don't see any particularly compelling reason
to gather things into one archive, though it does seem to help in a
community-building kind of sense.


On 17/02/2012 13:53, Derek M Jones wrote:

There are some efforts underway to do this. I'm familiar with
http://datacite.org/ and http://figshare.com. A couple of SE groups
have started data and model problem repositories, such as

Thanks for the links. figshare looks interesting.

The challenge is getting everyone on board. For now, I don't see a
compelling reason to use these places.

People could just as easily use GitHub, https://github.com/,
which is used by a lot of researchers to make their code freely
available (GitHub makes its money from people paying for hosting
of privately available code).

Your paper "Automated topic naming to support cross-project analysis
of software maintenance activities" is in my pile of interesting ones
to read in more detail. You can read about my own interest in naming
in www.knosof.co.uk/cbook/sent792.pdf

I suspect it won't happen until journals and conferences begin to
insist on it. There is a reason why retraction rates are so low in CS
and SE: there is no way to reproduce results to confirm them.

Cameron Neylon is a good point man on the issues around Science 2.0
and open access (http://cameronneylon.net/)

Neil Ernst

On 2012-02-16, at 7:15, Derek M Jones wrote:


A couple of researchers I have contacted to obtain data
told me that they have either lost it or did not make an
effort to keep it.

Having someplace that people could automatically upload their
data to might help preserve more of it, as well as making
life easier for others by cutting down on search time.

A while back I was asked to prepare an area on the PPIG website
where people could upload data for public consumption (surrounded by
appropriate caveats of course). The data I was preparing for didn't
ever turn up so the area remains hidden, but I can certainly expose
this in some way if people wish to use it.

Derek M. Jones tel: +44 (0) 1252 520 667
Knowledge Software Ltd blog:shape-of-code.coding-guidelines.com
Source code analysis http://www.knosof.co.uk

The Open University is incorporated by Royal Charter (RC 000391), an
exempt charity in England & Wales and a charity registered in Scotland
(SC 038302).
