On Tuesday, 12 May, 2015 16:06:51 Douglas Theobald wrote:
> On May 12, 2015, at 3:19 PM, Robbie Joosten <robbie_joos...@hotmail.com> 
> wrote:
> > 
> > I strongly disagree with rejecting paper for any other reasons than
> > scientific ones.
> 
> I agree, but … one of the foundations of science is independent replicability 
> and verifiability.  In practice, for me to be able to replicate and verify 
> your computational analysis and results, I will need to be able to see your 
> source code, compile it myself, and potentially modify it.  These 
> requirements in effect necessitate some sort of open source model, in the 
> broadest sense of the term.  To take one of your examples, the Ms-RSL license 
> — I can’t effectively replicate and verify your results if I’m legally 
> prohibited from compiling and modifying your source code, so the Ms-RSL is 
> out.  
> 
> > A paper describing software should properly describe the
> > algorithms to ensure the reproducibility.
> 
> *Should*.  In practice, we all know (those programmers among us do, anyway) 
> that descriptions of source code do not suffice.  

These issues can burn you in multiple ways.
The list of combinations goes on and on and on...


Case 1) The algorithm being described is sound and is valuable

A) The program correctly implements the algorithm described
   i) it is or will be available on request (with or without source)
   ii) it is available to the reviewers at the time the paper is reviewed
      (with or without source)
   iii) you can't have it

B) The program does something other than what the published algorithm
   describes: maybe it's buggy, maybe the description is out of date, or
   maybe it turns out to be sensitive to hardware, OS, or external
   libraries (see the sketch after this list)
   i) available to users
   ii) available to reviewers
   iii) you can't have it

Case 2) The algorithm being described is unsound or otherwise flawed

A) The program correctly implements it, and therefore is also unsound
   i/ii/iii as above

B) The program itself is sound because what it implements is not 
   actually what is described.
   i/ii/iii as above
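
To make 1B concrete, here is a minimal sketch (Python, purely
illustrative).  Both calls below "sum the values", exactly as a written
description would put it, yet they disagree in the last bits; differences
of this kind compound across hardware, compilers, and math libraries.

    import math

    vals = [0.1] * 10
    print(sum(vals))        # 0.9999999999999999 with IEEE-754 doubles
    print(math.fsum(vals))  # 1.0 (compensated summation)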

What criteria need to be met in order for the work to be "reproducible"?

I would argue that the most stringent test is to be able to 
reimplement the algorithm independently.  This does not require access
to either the original program or source code, but goes way beyond what
is expected of reviewers in our field.
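
As a minimal sketch of that test (Python purely for illustration; the
function and the published number are hypothetical): code the algorithm
from the written description alone, then check the value reported in the
paper within a stated tolerance.

    import math

    def reimplemented_from_paper(data):
        # Built only from the prose description, never from their source.
        return sum(x * x for x in data) / len(data)

    published_value = 4.6667   # hypothetical number quoted from the paper
    mine = reimplemented_from_paper([1.0, 2.0, 3.0])
    assert math.isclose(mine, published_value, rel_tol=1e-3)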

Note that many of the combinations above cannot be distinguished 
easily at the time of review without reimplementing the algorithm.
If you can re-run their program to confirm their result, how do you
distinguish between 1A (all OK), and 2B (fortuitously correct program,
but would not be possible to reimplement from the published algorithm)?
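
A contrived sketch of that ambiguity (again Python, entirely
hypothetical): suppose the paper says "we report the median", but the
program actually computes the mean.

    import statistics

    def published_statistic(values):
        return sum(values) / len(values)   # mean, despite the description

    # On the test data shipped with the program the two coincide, so
    # re-running the program "confirms" the paper:
    print(published_statistic([10.0, 20.0, 30.0]))   # 20.0 == the median

    # An independent reimplementation of the *described* algorithm
    # diverges as soon as the data are skewed:
    print(statistics.median([10.0, 20.0, 90.0]))     # 20.0
    print(published_statistic([10.0, 20.0, 90.0]))   # 40.0

Re-running their binary would never catch this; reimplementing from the
text would.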

An innovative algorithm may be valuable even if the first program
that implements it is crap.  Conversely, there are useful and widely
used programs that are based purely on empirical hacks, without being
grounded in any overarching theoretical treatment.

        Ethan




> > The source should be available for
> > inspection to ensure the program does what was claimed, for all I care this
> > can be under the Ms-RSL license or just under good-old copyright. The
> > program should preferably be available free for academic users, but if the
> > paper is good you should be able to re-implement the tool if it is too
> > expensive or doesn't exactly do what you want, so it isn't entirely
> > necessary.
> 
> > Making the software open source (in an OSS sense) does not solve any
> > problems that a good description of the algorithms doesn't do well already.
> 
> This is just wildly wrong.  It’s basically impossible to ensure and verify 
> that a “good” description of the algorithm actually corresponds to the source 
> code without seeing, using, and modifying the source.  To take an 
> experimental analogy — my lab has endured several cases where we read a 
> “good” published description of the subcloning and sequencing of some vector, 
> only to find that the detailed published description is wrong when we are 
> given the chance to analyze the vector ourselves.  It happens all the time, 
> and computer code is no different in this respect.  
> 
> > OSS does not guarantee long-term availability; a paper will likely outlive
> > the software repository. OSS licenses (though not the BSD license) can be so
> > restrictive that you end up having to re-implement the algorithms anyway. So
> > not having an OSS license should not be a reason to reject the paper about
> > the software.
> > 
> > Cheers,
> > Robbie 
> > 
> >> -----Original Message-----
> >> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
> >> James Stroud
> >> Sent: Tuesday, May 12, 2015 20:40
> >> To: CCP4BB@JISCMAIL.AC.UK
> >> Subject: Re: [ccp4bb] [RANT] Reject Papers describing non-open source
> >> software
> >> 
> >> On May 12, 2015, at 12:29 PM, Roger Rowlett <rrowl...@colgate.edu>
> >> wrote:
> >> 
> >>> Was the research publicly funded? If you receive funds from NSF, for
> >>> example, you are expected to share and "make widely available and usable"
> >>> software and inventions created under a grant (section VI.D.4. of the
> >>> Award and administration guide). I don't know how enforceable that clause
> >>> is, however.
> >> 
> >> The funding shouldn't matter. I suggest that a publication that has the
> >> purpose of describing non-open source software should be summarily
> >> rejected by referees. In other words, the power is in our hands, not the
> >> NSF's.
-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742
